**Flemming Nielson David Sands (Eds.)**

# **Principles of Security and Trust**

**8th International Conference, POST 2019 Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019 Prague, Czech Republic, April 6–11, 2019, Proceedings**

# Lecture Notes in Computer Science 11426

Commenced publication in 1973

Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

### Editorial Board Members

David Hutchison, UK  
Takeo Kanade, USA  
Josef Kittler, UK  
Jon M. Kleinberg, USA  
Friedemann Mattern, Switzerland  
John C. Mitchell, USA  
Moni Naor, Israel  
C. Pandu Rangan, India  
Bernhard Steffen, Germany  
Demetri Terzopoulos, USA  
Doug Tygar, USA

### Advanced Research in Computing and Software Science Subline of Lecture Notes in Computer Science

Subline Series Editors

Giorgio Ausiello, University of Rome 'La Sapienza', Italy  
Vladimiro Sassone, University of Southampton, UK

Subline Advisory Board

Susanne Albers, TU Munich, Germany  
Benjamin C. Pierce, University of Pennsylvania, USA  
Bernhard Steffen, University of Dortmund, Germany  
Deng Xiaotie, Peking University, Beijing, China  
Jeannette M. Wing, Microsoft Research, Redmond, WA, USA

More information about this series at http://www.springer.com/series/7410


Editors Flemming Nielson Technical University of Denmark Kongens Lyngby, Denmark

David Sands Chalmers University of Technology Gothenburg, Sweden

ISSN 0302-9743 ISSN 1611-3349 (electronic)  
Lecture Notes in Computer Science  
ISBN 978-3-030-17137-7 ISBN 978-3-030-17138-4 (eBook)  
https://doi.org/10.1007/978-3-030-17138-4

Library of Congress Control Number: 2019936300

LNCS Sublibrary: SL4 – Security and Cryptology

© The Editor(s) (if applicable) and The Author(s) 2019. This book is an open access publication.

Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

### ETAPS Foreword

Welcome to the 22nd ETAPS! This is the first time that ETAPS has taken place in the Czech Republic, in its beautiful capital Prague.

ETAPS 2019 was the 22nd instance of the European Joint Conferences on Theory and Practice of Software. ETAPS is an annual federated conference established in 1998, and consists of five conferences: ESOP, FASE, FoSSaCS, TACAS, and POST. Each conference has its own Program Committee (PC) and its own Steering Committee (SC). The conferences cover various aspects of software systems, ranging from theoretical computer science and foundations to programming language developments, analysis tools, formal approaches to software engineering, and security.

Organizing these conferences in a coherent, highly synchronized conference program enables participation in an exciting event, offering the possibility to meet many researchers working in different directions in the field and to easily attend talks of different conferences. ETAPS 2019 featured a new program item: the Mentoring Workshop. This workshop is intended to help students early in their careers with advice on research, career, and life in the fields of computing covered by the ETAPS conferences. On the weekend before the main conference, numerous satellite workshops took place and attracted many researchers from all over the globe.

ETAPS 2019 received 436 submissions in total, 137 of which were accepted, yielding an overall acceptance rate of 31.4%. I thank all the authors for their interest in ETAPS, all the reviewers for their reviewing efforts, the PC members for their contributions, and in particular the PC (co-)chairs for their hard work in running this entire intensive process. Last but not least, my congratulations to all authors of the accepted papers!

ETAPS 2019 featured the unifying invited speakers Marsha Chechik (University of Toronto) and Kathleen Fisher (Tufts University) and the conference-specific invited speakers (FoSSaCS) Thomas Colcombet (IRIF, France) and (TACAS) Cormac Flanagan (University of California at Santa Cruz). Invited tutorials were provided by Dirk Beyer (Ludwig Maximilian University) on software verification and Cesare Tinelli (University of Iowa) on SMT and its applications. On behalf of the ETAPS 2019 attendants, I thank all the speakers for their inspiring and interesting talks!

ETAPS 2019 took place in Prague, Czech Republic, and was organized by Charles University. Charles University was founded in 1348 and was the first university in Central Europe. It currently hosts more than 50,000 students. ETAPS 2019 was further supported by the following associations and societies: ETAPS e.V., EATCS (European Association for Theoretical Computer Science), EAPLS (European Association for Programming Languages and Systems), and EASST (European Association of Software Science and Technology). The local organization team consisted of Jan Vitek and Jan Kofron (general chairs), Barbora Buhnova, Milan Ceska, Ryan Culpepper, Vojtech Horky, Paley Li, Petr Maj, Artem Pelenitsyn, and David Safranek.

The ETAPS SC consists of an Executive Board, and representatives of the individual ETAPS conferences, as well as representatives of EATCS, EAPLS, and EASST. The Executive Board consists of Gilles Barthe (Madrid), Holger Hermanns (Saarbrücken), Joost-Pieter Katoen (chair, Aachen and Twente), Gerald Lüttgen (Bamberg), Vladimiro Sassone (Southampton), Tarmo Uustalu (Reykjavik and Tallinn), and Lenore Zuck (Chicago). Other members of the SC are: Wil van der Aalst (Aachen), Dirk Beyer (Munich), Mikolaj Bojanczyk (Warsaw), Armin Biere (Linz), Luis Caires (Lisbon), Jordi Cabot (Barcelona), Jean Goubault-Larrecq (Cachan), Jurriaan Hage (Utrecht), Rainer Hähnle (Darmstadt), Reiko Heckel (Leicester), Panagiotis Katsaros (Thessaloniki), Barbara König (Duisburg), Kim G. Larsen (Aalborg), Matteo Maffei (Vienna), Tiziana Margaria (Limerick), Peter Müller (Zurich), Flemming Nielson (Copenhagen), Catuscia Palamidessi (Palaiseau), Dave Parker (Birmingham), Andrew M. Pitts (Cambridge), Dave Sands (Gothenburg), Don Sannella (Edinburgh), Alex Simpson (Ljubljana), Gabriele Taentzer (Marburg), Peter Thiemann (Freiburg), Jan Vitek (Prague), Tomas Vojnar (Brno), Heike Wehrheim (Paderborn), Anton Wijs (Eindhoven), and Lijun Zhang (Beijing).

I would like to take this opportunity to thank all speakers, attendants, organizers of the satellite workshops, and Springer for their support. I hope you all enjoy the proceedings of ETAPS 2019. Finally, a big thanks to Jan and Jan and their local organization team for all their enormous efforts enabling a fantastic ETAPS in Prague!

February 2019

Joost-Pieter Katoen  
ETAPS SC Chair  
ETAPS e.V. President

### Preface

This volume contains the papers presented at POST 2019, the 8th Conference on Principles of Security and Trust, held April 11, 2019, in Prague, Czech Republic, as part of ETAPS. Principles of Security and Trust is a broad forum related to all theoretical and foundational aspects of security and trust, and thus welcomes papers of many kinds: new theoretical results, practical applications of existing foundational ideas, and innovative approaches stimulated by pressing practical problems, as well as systematization-of-knowledge papers, papers describing tools, and position papers. POST was created in 2012 to combine and replace a number of successful and long-standing workshops in this area: Automated Reasoning and Security Protocol Analysis (ARSPA), Formal Aspects of Security and Trust (FAST), Security in Concurrency (SecCo), and the Workshop on Issues in the Theory of Security (WITS). A subset of these events met jointly as an event affiliated with ETAPS 2011 under the name "Theory of Security and Applications" (TOSCA).

There were 27 submissions to POST 2019. Each submission was reviewed by at least three Program Committee members, who in some cases solicited the help of outside experts to review the papers. We employed a double-blind reviewing process with a rebuttal phase. Electronic discussion was used to decide which papers to select for the program. The committee decided to accept ten papers (37%). The papers are organized in topical sections named: Covert Channels and Information Flow; Privacy and Protocols; Distributed Systems.

We would like to thank the members of the Program Committee, the additional reviewers, the POST Steering Committee, the ETAPS Steering Committee, and the local Organizing Committee, who all contributed to the success of POST 2019. We also thank all authors of submitted papers for their interest in POST and congratulate the authors of accepted papers.

February 2019

Flemming Nielson  
David Sands

### Organization

### Program Committee

Owen Arden, UC Santa Cruz, USA  
Aslan Askarov, Aarhus University, Denmark  
Musard Balliu, KTH Royal Institute of Technology, Sweden  
Chiara Bodei, University of Pisa, Italy  
Veronique Cortier, CNRS, Loria, France  
Pierpaolo Degano, University of Pisa, Italy  
Dieter Gollmann, Hamburg University of Technology, Germany  
Joshua Guttman, Worcester Polytechnic Institute, USA  
Justin Hsu, University of Pennsylvania, USA  
Michael Huth, Imperial College London, UK  
Heiko Mantel, TU Darmstadt, Germany  
Fabio Martinelli, IIT-CNR, Italy  
Flemming Nielson, Technical University of Denmark, Denmark  
Christian W. Probst, Unitec Institute of Technology, New Zealand  
Peter Ryan, University of Luxembourg, Luxembourg  
Andrei Sabelfeld, Chalmers University of Technology, Sweden  
David Sands, Chalmers University of Technology, Sweden  
Carsten Schürmann, IT University of Copenhagen, Denmark  
Alwen Tiu, The Australian National University, Australia  
Mingsheng Ying, University of Technology Sydney, Australia

### Additional Reviewers

Bartoletti, Massimo  
Beckert, Bernhard  
Busi, Matteo  
Callia D'Iddio, Andrea  
Costa, Gabriele  
Dietrich, Sven  
Galletta, Letterio  
Gheri, Lorenzo  
Hamann, Tobias  
Isemann, Raphael  
Jacomme, Charlie  
Lallemand, Joseph  
Liu, Junyi  
Lluch Lafuente, Alberto  
Mercaldo, Francesco  
Mestel, David  
Miculan, Marino  
Pedersen, Mathias  
Rafnsson, Willard  
Rakotonirina, Itsaka  
Saracino, Andrea  
Vazquez Sandoval, Itzel  
Wang, Qisheng  
Weber, Alexandra  
Yautsiukhin, Artsiom  
Zheng, Haofan  
Zhou, Li

### Contents


# **Foundations for Parallel Information Flow Control Runtime Systems**

Marco Vassena<sup>1(B)</sup>, Gary Soeller<sup>2</sup>, Peter Amidon<sup>2</sup>, Matthew Chan<sup>3</sup>, John Renner<sup>2</sup>, and Deian Stefan<sup>2(B)</sup>

> <sup>1</sup> Chalmers University, Gothenburg, Sweden (vassena@chalmers.se)  
> <sup>2</sup> UC San Diego, San Diego, USA (deian@cs.ucsd.edu)  
> <sup>3</sup> Awake Security, Sunnyvale, USA

**Abstract.** We present the foundations for a new dynamic information flow control (IFC) parallel runtime system, LIOPAR. To our knowledge, LIOPAR is the first dynamic language-level IFC system to (1) support deterministic parallel thread execution and (2) eliminate both internal- and external-timing covert channels that exploit the runtime system. Most existing IFC systems are vulnerable to external timing attacks because they are built atop vanilla runtime systems that do not account for security—these runtime systems allocate and reclaim shared resources (e.g., CPU-time and memory) *fairly* between threads at different security levels. While such attacks have largely been ignored—or, at best, mitigated—we demonstrate that extending IFC systems with parallelism leads to the *internalization* of these attacks. Our IFC runtime system design addresses these concerns by hierarchically managing resources—both CPU-time and memory—and making resource allocation and reclamation explicit at the language-level. We prove that LIOPAR is secure, i.e., it satisfies progress- and timing-sensitive non-interference, even when exposing clock and heap-statistics APIs.

### **1 Introduction**

Language-level dynamic information flow control (IFC) is a promising approach to building secure software systems. With IFC, developers specify application-specific, data-dependent security policies. The language-level IFC system—often implemented as a library or as part of a language runtime system—then enforces these policies automatically, by tracking and restricting the flow of information throughout the application. In doing so, IFC can ensure that different application components—even when buggy or malicious—cannot violate data confidentiality or integrity.

This work was supported in part by the CONIX Research Center, one of six centers in JUMP, a Semiconductor Research Corporation program sponsored by DARPA and by gifts from Cisco and Fujitsu. This work was partly done while Marco Vassena and Matthew Chan were at UCSD.

The key to making language-level IFC practical lies in designing real-world programming language features and abstractions without giving up on security. Unfortunately, many practical language features are at odds with security. For example, even exposing language features as simple as `if`-statements can expose users to timing attacks [42,64]. Researchers have made significant strides towards addressing these challenges—many IFC systems now support real-world features and abstractions safely [10,15,20,34,43,50,51,54,55,59,60,62,67,68]. To the best of our knowledge, though, no existing language-level dynamic IFC system supports parallelism. Yet, many applications rely on parallel thread execution. For example, modern Web applications typically handle user requests in parallel, on multiple CPU cores, taking advantage of modern hardware. Web applications built atop state-of-the-art dynamic IFC Web frameworks (e.g., Jacqueline [67], Hails [12,13], and LMonad [45]), unfortunately, do not handle user requests in parallel—the language-level IFC systems that underlie them (e.g., Jeeves [68] and LIO [54]) do not support parallel thread execution.

In this paper we show that extending most existing IFC systems—even concurrent IFC systems such as LIO—with parallelism is unsafe. The key insight is that most IFC systems *do not* prevent sensitive computations from affecting public computations; they simply prevent public computations from *observing* such sensitive effects. In the sequential and concurrent setting, such effects are only observable to attackers *external* to the program and thus outside the scope of most IFC systems. However, when computations execute in parallel, they are essentially external to one another and thus do not require an observer external to the system—they can observe such effects internally.

Consider a program consisting of three concurrent threads: two public threads—p0 and p1—and a secret thread—s0. On a single core, language-level IFC can ensure that p0 and p1 do not learn anything secret by, for example, disallowing them from observing the return values (or lack thereof) of the secret thread. Systems such as LIO are careful to ensure that public threads cannot learn secrets even indirectly (e.g., via covert channels that abuse the runtime system scheduler). But secret threads *can* leak information to an external observer that monitors public events (e.g., messages from public threads) by influencing the behavior of the public threads. For example, s0 can terminate (or not) based on a secret and thus affect the amount of time p0 and p1 spend executing on the CPU—if s0 terminated, the runtime allots the whole CPU to public threads; otherwise it only allots, say, two thirds of the CPU to the public threads. This allows an external attacker to trivially infer the secret (e.g., by measuring the rate of messages written to a public channel). Unfortunately, such *external timing attacks* manifest *internally* to the program when threads execute in parallel, on multiple cores. Suppose, for example, that p0 and s0 are co-located on a core and run in parallel to p1. By terminating early (or not) based on a secret, s0 affects the CPU time allotted to p0, which can be measured by p1. For example, p1 can count the number of messages sent from p0 on a public channel—the number of p0 writes indirectly leaks whether or not s0 terminated.
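To make the rate difference concrete, the two-core scenario can be simulated with a few lines of Python. This is our own illustrative sketch, not LIO code: the tick-based round-robin scheduler, the function name `run_cores`, and the tick counts are simplified stand-ins.

```python
# Sketch: two cores with round-robin scheduling. Reclaiming a terminated
# secret thread's CPU slice changes the public thread's write rate, which
# a parallel public thread can observe by comparing message counts.

def run_cores(secret, ticks=1000):
    """Return (messages written by p0 on core c0, by p1 on core c1)."""
    p0_writes = p1_writes = 0
    s0_alive = True
    for t in range(ticks):
        # Core c1: p1 runs alone and writes on every tick.
        p1_writes += 1
        # Core c0: round-robin between p0 and s0 while s0 is alive.
        if s0_alive:
            if t % 2 == 0:
                p0_writes += 1          # p0's turn
            else:
                s0_alive = not secret   # s0's turn: terminate iff secret
        else:
            p0_writes += 1              # runtime reassigned s0's slice to p0
    return p0_writes, p1_writes
```

With `secret = True`, s0 terminates and p0's count approaches p1's; with `secret = False`, p0 writes at roughly half of p1's rate. Comparing the two counts reveals the secret without any observer external to the program.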

We demonstrate that such attacks are feasible by building several proof-of-concept programs that exploit the way the runtime system allocates and reclaims *shared* resources to violate LIO's security guarantees. Then, we design a new dynamic parallel language-level IFC runtime system called LIOPAR, which extends LIO to the parallel setting by changing how *shared* runtime system resources—namely CPU-time and memory—are managed. Ordinary runtime systems (e.g., GHC for LIO) *fairly* balance resources between threads; this means that allocations or reclamations for secret LIO threads directly affect resources available for public LIO threads. In contrast, LIOPAR makes resource management *explicit* and *hierarchical*. When allocating new resources on behalf of a thread, the LIOPAR runtime does not "fairly" steal resources from all threads. Instead, LIOPAR demands that the thread requesting the allocation explicitly gives up a portion of its own resources. Similarly, the runtime does not automatically relinquish the resources of a terminated thread—it requires the parent thread to explicitly reclaim them.

Nevertheless, automatic memory management is an integral component of modern language runtimes—high-level languages (e.g., Haskell and thus LIO) are typically garbage collected, relieving developers from manually reclaiming unused memory. Unfortunately, even if memory is hierarchically partitioned, some garbage collection (GC) algorithms, such as GHC's stop-the-world GC, may introduce timing covert channels [46]. Inspired by previous work on real-time GCs (e.g., [3,5,6,16,44,48]), we equip LIOPAR with a per-thread, interruptible garbage collector. This strategy is agnostic to the particular GC algorithm used: our hierarchical runtime system only demands that the GC runs within the memory confines of individual threads and their time budget.
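The difference between a stop-the-world collector and a per-thread, budget-charged collector can be sketched abstractly. The model below is hypothetical (the function name, tick counts, and GC cost are made up); it only illustrates why charging collection to the collecting thread's own time budget removes the observable pause.

```python
# Sketch: progress of a public thread on core c1 while a secret thread
# on core c0 runs a GC lasting gc_cost ticks, under two GC strategies.

def progress_with_gc(gc_model, ticks=100, gc_at=10, gc_cost=30):
    public_progress = 0
    for t in range(ticks):
        in_gc = gc_at <= t < gc_at + gc_cost
        if gc_model == "stop-the-world" and in_gc:
            continue                    # every thread on every core pauses
        public_progress += 1            # per-thread GC: only c0 pays for it
    return public_progress
```

Under stop-the-world collection the public thread loses exactly `gc_cost` ticks of progress, leaking that the secret thread collected; with a per-thread GC its progress is unchanged.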

In sum, this paper makes three contributions:

- We demonstrate that external timing attacks against the runtime system manifest internally when IFC programs execute in parallel (Sect. 2).
- We design LIOPAR, a hierarchical runtime system that makes the allocation and reclamation of CPU-time and memory explicit at the language level (Sects. 3 and 4).
- We prove that LIOPAR satisfies progress- and timing-sensitive non-interference (Sect. 5).


Neither our attack nor our defense is tied to LIO or GHC—we focus on LIO because it already supports concurrency. We believe that extending any existing language-level IFC system with parallelism will pose the same set of challenges: challenges that can be addressed using explicit and hierarchical resource management.

### **2 Internal Manifestation of External Attacks**

In this section we give a brief overview of LIO and discuss the implications of shared, finite runtime system resources on security. We demonstrate several external timing attacks against LIO that abuse two such resources—the thread scheduler and garbage collector—and show how running LIO threads in parallel internalizes these attacks.

#### **2.1 Overview of the Concurrent LIO Information Flow Control System**

At a high level, the goal of an IFC system is to track and restrict the flow of information according to a security policy—almost always a form of *noninterference* [14]. Informally, this policy ensures *confidentiality*, i.e., secret data should not leak to public entities, and *integrity*, i.e., untrusted data should not affect trusted entities.

To this end, LIO tracks the flow of information at a coarse-granularity, by associating *labels* with threads. Implicitly, the thread label classifies all the values in its scope and reflects the sensitivity of the data that it has inspected. Indeed, LIO "raises" the label of a thread to accommodate for reading yet more sensitive data. For example, when a public thread reads secret data, its label is raised to secret—this reflects the fact that the rest of the thread computation may depend on sensitive data. Accordingly, LIO uses the thread's *current label* or *program counter label* to restrict its communication. For example, a secret thread can only communicate with other secret threads.
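The floating-label discipline described above can be sketched in a few lines. This is an illustrative Python model of coarse-grained label tracking over a two-point lattice, not LIO's actual Haskell API; the class and method names are our own.

```python
# Sketch: a thread's current (program counter) label floats up when it
# reads more sensitive data, and writes are only permitted to channels
# at or above that label ("no write down").

PUBLIC, SECRET = 0, 1              # two-point lattice: PUBLIC flows to SECRET

class Thread:
    def __init__(self):
        self.pc = PUBLIC           # current label, initially public

    def read(self, data, label):
        # Reading raises the floating label to the join of pc and label.
        self.pc = max(self.pc, label)
        return data

    def write(self, channel_label):
        # Allowed only if pc can flow to the channel's label.
        if self.pc > channel_label:
            raise PermissionError("write would leak: pc above channel label")
        return "ok"
```

Once a thread reads secret data its `pc` becomes `SECRET`, and any later attempt to write to a `PUBLIC` channel is rejected: the whole rest of the computation is treated as potentially depending on the secret.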

In LIO, developers can express programs that manipulate data of varying sensitivity—for example programs that handle both public and secret data—by forking multiple threads, at run-time, as necessary. However, naively implementing concurrency in an IFC setting is dangerous: concurrency can amplify and internalize the *termination covert channel* [1,58], for example, by allowing public threads to observe whether or not secret threads terminated. Moreover, concurrency often introduces *internal timing covert channels* wherein secret threads leak information by influencing the scheduling behavior of public threads. Both classes of covert channels are high-bandwidth and easy to exploit.

Stefan et al. [54] were careful to ensure that LIO does not expose these termination and timing covert channels *internally*. LIO ensures that even if secret threads terminate early, loop forever, or otherwise influence the runtime system scheduler, they cannot leak information to public threads. But, secret threads *do* affect public threads with those actions and thus expose timing covert channels *externally*—public threads just cannot detect it. In particular, LIO disallows public threads from (1) directly inspecting the return values (and thus timing and termination behavior) of secret threads, without first raising their program counter label, and (2) observing runtime system resource usage (e.g., elapsed time or memory availability) that would indirectly leak secrets.

LIO prevents public threads from measuring CPU-time usage directly— LIO does not expose a clock API—and indirectly—threads are scheduled fairly in a round-robin fashion [54]. Similarly, LIO prevents threads from measuring memory usage directly—LIO does not expose APIs for querying heap statistics—and indirectly, through garbage collection cycles (e.g., induced by secret threads) [46]—GHC's stop-the-world GC stops all threads. Like other IFC systems, the security guarantees of LIO are weaker in practice because its formal model does not account for the GC and assumes memory to be infinite [54,55].

#### **2.2 External Timing Attacks to Runtime Systems**

Since secret threads can still influence public threads by abusing the scheduler and GC, LIO is vulnerable to *external timing and termination attacks*, i.e., attacks that leak information to external observers. To illustrate this, we craft several LIO programs consisting of two threads: a public thread p that writes to the external channel observed by the attacker and a secret thread s, which abuses the runtime to influence the throughput of the public thread. Depending on a secret, thread s can, for example:

1. fork many additional threads, reducing the CPU time the scheduler allots to p;
2. terminate early, freeing CPU time that the runtime reassigns to p;
3. allocate memory until the heap is exhausted;
4. trigger garbage collection cycles that pause p.

These attacks abuse the runtime's automatic *allocation* and *reclamation* of shared resources, i.e., CPU time and memory. In particular, attack 1 hinges on the runtime *allocating* CPU time for the new secret threads, thus reducing the CPU time allotted to the public thread. Dually, attack 2 relies on it *reclaiming* the CPU time of terminated threads—it reassigns it to public threads. Similarly, attacks 3 and 4 force the runtime to allocate all the available memory and preemptively reassign CPU time to the GC, respectively.

These attacks are not surprising, but, with the exception of the GC-based attack [46], they are novel in the IFC context. Moreover, these attacks are neither exhaustive—there are other ways to exploit the runtime system—nor optimized—our implementation leaks sensitive data at a rate of roughly 2 bits/second<sup>1</sup>. Nevertheless, they are feasible and—because they abuse the runtime—they are effective against language-level external-timing mitigation techniques, including [54,71]. The attacks are also feasible on other systems: similar attacks that abuse the GC have been demonstrated for both the V8 and JVM runtimes [46].

<sup>1</sup> A more assiduous attacker could craft similar attacks that leak at higher bit-rates.


**Fig. 1.** In this attack three threads run in parallel, colluding to leak the value of secret. The two public threads write to a *public* output channel; the relative number of messages written on the channel by each thread directly leaks the secret (as inferred by p1). To affect the rate at which p0 can write, s0 conditionally terminates, which frees up time on core c0 for p0 to execute.

#### **2.3 Internalizing External Timing Attacks**

LIO, like almost all IFC systems, considers external timing out of scope for its attacker model. Unfortunately, when we run LIO threads on multiple cores, in parallel, the allocation and reclamation of resources on behalf of secret threads is indirectly observable by public threads. Unsurprisingly, some of the above external timing attacks manifest internally—a thread running on a parallel core acts as an "external" attacker. To demonstrate the feasibility of such attacks, we describe two variants of the aforementioned scheduler-based attacks which leak sensitive information internally to public threads.

Secret threads can leak information by relinquishing CPU time, which the runtime reclaims and *unsafely* redistributes to public threads running on the same core. Our attack program consists of three threads: two public threads—p0 and p1—and a secret thread—s0. Figure 1 shows the pseudo-code for this attack. Note that the threads are secure in isolation, but leak the value of secret when executed in parallel with a round-robin scheduler. In particular, threads p0 and s0 run concurrently on core c0 using half of the CPU time each, while p1 runs in parallel alone on core c1 using all the CPU time. Both public threads repeatedly write their respective thread IDs to a *public channel*. The secret thread, on the other hand, loops forever or terminates depending on secret. Intuitively, when the secret thread terminates, the runtime system redirects its CPU time to p0, causing both p1 and p0 to write at the same rate. Conversely, when the secret thread does not terminate early, p0 is scheduled in a round-robin fashion with s0 on the same core and can thus only write half as fast as p1. More specifically, p1 can infer the secret by counting the messages that p0 writes in a fixed window and comparing that count with its own.<sup>2</sup>



Secret LIO threads can also leak information by allocating many secret threads on a core with public threads—this reduces the CPU-time available to the public threads. For example, using the same setting with three threads from before, the secret thread forks a spinning thread on core c1 by replacing command terminate with command fork (forever skip) c1 in the code of thread s0 in Fig. 1. Intuitively, if secret is false, then p1 writes more often than p0, as before; otherwise the write rate of p1 decreases—it shares core c1 with the child thread of s0—and p0 writes as often as p1.

Not all external timing attacks can be internalized, however. In particular, GHC's approach to reclaiming memory via a stop-the-world GC simultaneously stops all threads on *all* cores, so the relative write rates of public threads remain constant. Interestingly, though, implementing LIO on runtimes (e.g., Node.js as proposed by Heule et al. [17]) with modern parallel garbage collectors that do not always stop the world would internalize the GC-based external timing attacks. Similarly, abusing GHC's memory allocation to exhaust all memory crashes all the program threads and, even though it cannot be internalized, it still results in information leakage.

### **3 Secure, Parallel Runtime System**

To address the external and internal timing attacks, we propose a new dynamic IFC runtime system design. Fundamentally, today's runtime systems are vulnerable because they automatically allocate and reclaim resources that are shared across threads of varying sensitivity. However, the automatic allocation and reclamation is not in itself a problem—it is only a problem because the runtime steals (and grants) resources from (and to) differently-labeled threads.

Our runtime system, LIOPAR, explicitly partitions CPU-time and memory among threads—each thread has a fixed CPU-time and memory *budget* or *quota*. This allows resource management decisions to be made locally, for each thread, independent of the other threads in the system. For example, the runtime scheduler of LIOPAR relies on CPU-time partitioning to ensure that threads always run for a fixed amount of time, irrespective of the other threads running on the same core. Similarly, in LIOPAR, the memory allocator and garbage collector rely on memory partitioning to be able to allocate and collect memory on behalf of a thread without being influenced or otherwise influencing other threads in the system. Furthermore, partitioning resources among threads enables fine-grained control of resources: LIOPAR exposes secure primitives to (i) measure resource usage (e.g., time and memory) and (ii) elicit garbage collection cycles.

<sup>2</sup> The attacker needs to empirically find the parameter n, so that p1 writes roughly twice as much as thread p0, which has half the CPU time on core c0.

The LIOPAR runtime does not automatically balance resources between threads. Instead, LIOPAR makes resource management explicit at the language level. When forking a new thread, for example, LIOPAR demands that the parent thread give up part of its own CPU-time and memory budgets to the child. Indeed, LIOPAR even manages core ownership or *capabilities* that allow threads to fork threads across cores. This approach ensures that allocating new threads does not indirectly leak any information externally or to other threads. Dually, the LIOPAR runtime does not re-purpose unused memory or CPU-time, even when a thread terminates or "dies" abruptly—parent threads must explicitly kill their children when they wish to reclaim their resources.

To ensure that CPU-time and memory can always be reclaimed, LIOPAR allows threads to kill their children at any time. Unsurprisingly, this feature requires restricting the LIOPAR floating-label approach more than that of LIO—LIOPAR threads cannot raise their current label if they have already forked other threads. As a result, in LIOPAR threads form a *hierarchy*—child threads are always at least as sensitive as their parent—and thus it is secure to expose an API to *allocate* and *reclaim* resources.
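Explicit, hierarchical budget management of this kind can be sketched as follows. This is a hypothetical model (the class `BudgetThread` and its methods are our own names); LIOPAR's actual primitives, clearance checks, and core capabilities are richer than this.

```python
# Sketch: forking carves the child's CPU/memory quota out of the parent's
# own budget, and only an explicit kill returns it to the parent. No
# resources are ever "fairly" taken from, or given to, unrelated threads.

class BudgetThread:
    def __init__(self, cpu, mem):
        self.cpu, self.mem = cpu, mem
        self.children = []

    def fork(self, cpu, mem):
        # The parent must pay for the child out of its own quota.
        if cpu > self.cpu or mem > self.mem:
            raise ValueError("insufficient budget to fork")
        self.cpu -= cpu
        self.mem -= mem
        child = BudgetThread(cpu, mem)
        self.children.append(child)
        return child

    def kill(self, child):
        # Reclamation is explicit: the parent recovers the child's budget.
        self.children.remove(child)
        self.cpu += child.cpu
        self.mem += child.mem
```

Because a terminated child's budget stays assigned to it until the parent kills it, a secret child cannot signal anything by terminating early, and because a fork must be paid for by the parent, a secret thread cannot starve public threads it does not own.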

**Attacks Revisited.** LIOPAR enforces security against *reclamation-based attacks* because secret threads cannot automatically relinquish their resources. For example, our hierarchical runtime system stops the attack in Fig. 1: even if secret thread s0 terminates (secret = true), the throughput of public thread p0 remains constant—LIOPAR does not reassign the CPU time of s0 to p0, but keeps s0 spinning until it gets killed. Similarly, LIOPAR protects against *allocation-based attacks* because secret threads cannot steal resources owned by other public threads. For example, the *fork-bomb* variant of the previous attack fails because LIOPAR aborts command fork (forever skip) c1—thread s0 does not own the core capability c1—and thus the throughput of p1 remains the same. In order to substantiate these claims, we first formalize the design of the *hierarchical* runtime system (Sect. 4) and establish its security guarantees (Sect. 5).

**Trust Model.** This work addresses attacks that exploit runtime system resource management—in particular memory and CPU-time. We do not address attacks that exploit other shared runtime system state (e.g., event loops [63], lazy evaluation [7,59]), shared operating system state (e.g., file system locks [24], events and I/O [22,32]), or shared hardware (e.g., caches, buses, pipelines and hardware threads [11,47]). Though these are valid concerns, they are orthogonal and outside the scope of this paper.

### **4 Hierarchical Calculus**

In this section we present the formal semantics of LIOPAR. We model LIOPAR as a security monitor that executes simply typed λ-calculus terms extended with *LIO* security primitives on an abstract machine in the style of Sestoft [53]. The security monitor reduces secure programs and aborts the execution of leaky programs.

**Semantics.** The state of the monitor, written (Δ, *pc*, N | *t*, *S*), stores the state of a thread under execution and consists of a heap Δ that maps variables to terms, the thread's program counter label *pc*, the set N containing the identifiers of the thread's children, the term *t* currently under reduction, and a stack of continuations *S*. Figure 2 shows the interesting rules of the sequential small-step operational semantics of the security monitor. The notation s →<sub>μ</sub> s′ denotes a transition of the machine in state s that reduces to state s′ in one step with thread parameters μ = (h, *cl*).<sup>3</sup> Since we are interested in modeling a system with *finite* resources, we parameterize the transition with the maximum heap size h ∈ ℕ. Additionally, the clearance label *cl* represents an upper bound on the sensitivity of the thread's floating current label *pc*. Rule [App1] begins a function application. Since our calculus is call-by-name, the function argument is saved as a *thunk* (i.e., an unevaluated expression) on the heap at a fresh location *x*, and the indirection is pushed on the stack for future lookups.<sup>4</sup> Note that the rule allocates memory on the heap, so the premise |Δ| < h forbids a heap overflow, where the notation |Δ| denotes the size of the heap Δ, i.e., the number of bindings that it contains.<sup>5</sup> To avoid overflows, a thread can measure the size of its own heap via the primitive *size* (Sect. 4.2). If *t*1 evaluates to a function, e.g., λ*y*.*t*, rule [App2] starts evaluating the body, in which the bound variable *y* is substituted with the heap-allocated argument *x*, i.e., *t*[*x*/*y*]. When the evaluation of the function body requires the value of the argument, variable *x* is looked up in the heap (rule [Var]). In the next paragraph we present the rules of the basic security primitives.
The other sequential rules are available in the extended version of this paper.
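To make the machine concrete, the following self-contained sketch (ours; the representation of terms, heaps and stacks is simplified and the fresh-name scheme is hypothetical) implements rules [App1], [App2] and [Var] with a heap bounded by h bindings:

```python
# A sketch (ours) of the Sestoft-style machine of Sect. 4, covering rules
# [App1], [App2] and [Var].  Terms are tuples: ("var", x), ("lam", y, t),
# ("app", t1, t2).  A state is (heap, term, stack of argument variables).

def subst(t, x, y):
    """t[x/y]: replace free occurrences of variable y by variable x."""
    tag = t[0]
    if tag == "var":
        return ("var", x) if t[1] == y else t
    if tag == "lam":
        return t if t[1] == y else ("lam", t[1], subst(t[2], x, y))
    return ("app", subst(t[1], x, y), subst(t[2], x, y))

def step(h, state):
    """One small step; None means stuck (including heap overflow)."""
    heap, term, stack = state
    if term[0] == "app":                      # [App1]: allocate a thunk
        if len(heap) >= h:
            return None                       # premise |heap| < h fails
        x = "x%d" % len(heap)                 # fresh name for this sketch
        return dict(heap, **{x: term[2]}), term[1], [x] + stack
    if term[0] == "lam" and stack:            # [App2]: enter the body
        return heap, subst(term[2], stack[0], term[1]), stack[1:]
    if term[0] == "var" and term[1] in heap:  # [Var]: look up the thunk
        return heap, heap[term[1]], stack
    return None
```

Here [App1] fails, rather than allocating, when the heap already holds h bindings, which is exactly the behavior the premise |Δ| < h captures.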

**Security Primitives.** A labeled value *Labeled ℓ t*◦ of type *Labeled* τ consists of a term *t* of type τ and a label ℓ, which reflects the sensitivity of the content. The annotation *t*◦ denotes that term *t* is *closed* and does not contain any free variable, i.e., fv(*t*) = ∅. We restrict the syntax of labeled values to closed terms for security reasons. Intuitively, LIOPAR would otherwise allocate the free variables of a secret labeled value on the heap, whose size would then leak information to public threads. For example, a public thread could distinguish between two secret values, e.g., *Labeled H x* with heap Δ = [*x* → 42], and *Labeled H* 0 with heap Δ = ∅, by measuring the size of the heap. To avoid that, labeled values are closed, and the size of the heap of a thread at a given security level is not affected by data labeled at different security levels. A term of type *LIO* τ is a secure computation that performs side effects and returns a result of type τ. Secure computations are structured using the standard monadic constructs *return t*, which embeds term *t* in the monad, and *bind*, written *t*1 >>= *t*2, which sequentially

<sup>3</sup> We use record notation, i.e., μ.h and μ.*cl*, to access the components of <sup>μ</sup>. <sup>4</sup> The calculus does not feature lazy evaluation. Laziness, because of *sharing*, introduces a covert channel, which has already been considered in previous work [59].

<sup>5</sup> To simplify reasoning, our generic memory model is basic and assumes a uniform size for all the objects stored in the heap. We believe that it is possible to refine our generic model with more accurate memory models (e.g., GHC's tagless G-machine (STG) [23], the basis for GHC's runtime [39]), but leave this to future work.

composes two monadic actions, the second of which takes the result of the first as an argument. Rule [Bind1] deconstructs a computation *t*1 >>= *t*2 into term *t*1 to be reduced first and pushes on the stack the continuation >>= *t*2 to be invoked after term *t*1.<sup>6</sup> Then, rule [Bind2] pops the topmost continuation from the stack (i.e., >>= *t*2) and evaluates it with the result of the first computation (i.e., *t*2 *t*1), which is considered complete when it evaluates to a monadic value, i.e., to the syntactic form *return t*1. The runtime monitor secures the interaction between computations and labeled values. In particular, secure computations can construct and inspect labeled values exclusively with the monadic primitives *label* and *unlabel*, respectively. Rules [Label1] and [Unlabel1] are straightforward and follow the pattern seen in the other rules. Rule [Label2] generates a labeled value at security level ℓ, subject to the constraint *pc* ⊑ ℓ ⊑ *cl*, which prevents a computation from labeling values below the program counter label *pc* or above the clearance label *cl*.<sup>7</sup> The rule computes the closure of the content, i.e., the closed term *t*◦, by recursively substituting every free variable in term *t* with its value in the heap, written Δ<sup>∗</sup>(*t*). Rule [Unlabel2] extracts the content of a labeled value and taints the program counter label with its label ℓ, i.e., it raises it to *pc* ⊔ ℓ, to reflect the sensitivity of the data that is now in scope. The premise *pc* ⊔ ℓ ⊑ *cl* ensures that the program counter label does not float over the clearance *cl*. Thus, the runtime monitor prevents the program counter label from floating above the clearance label (i.e., *pc* ⊑ *cl* always holds).
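The monitor checks of [Label2] and [Unlabel2] can be summarized in a few lines. The sketch below is ours and instantiates the checks for a two-point lattice; function names are hypothetical:

```python
# The monitor checks of rules [Label2] and [Unlabel2], sketched (by us)
# over the two-point lattice {L, H}.
L, H = "L", "H"

def flows(a, b):
    """The lattice order: only the flow from H to L is disallowed."""
    return not (a == H and b == L)

def lub(a, b):
    """The join of two labels in the two-point lattice."""
    return H if H in (a, b) else L

def check_label(pc, cl, l):
    """[Label2]: labeling at l succeeds iff pc flows to l and l to cl."""
    return flows(pc, l) and flows(l, cl)

def unlabel_pc(pc, cl, l):
    """[Unlabel2]: taint pc with l, provided the join flows to cl."""
    new_pc = lub(pc, l)
    return new_pc if flows(new_pc, cl) else None
```

Note that `unlabel_pc` fails (returns `None`) precisely when the taint would push the program counter label above the clearance, which is the invariant pc ⊑ cl described above.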

The calculus also includes concurrent primitives to allocate resources when forking threads (*fork* and *spawn*, Sect. 4.1), to reclaim resources and measure resource usage (*kill*, *size*, and *time*, Sect. 4.2), and for thread synchronization and communication (*wait*, *send* and *receive*, in the extended version of this paper).

#### **4.1 Core Scheduler**

In this section, we extend LIOPAR with concurrency, which enables (i) *interleaved* execution of threads on a single core and (ii) *simultaneous* execution on κ cores. To protect against attacks that exploit the automatic management of shared *finite* resources (e.g., those in Sect. 2.3), LIOPAR maintains a resource budget for each running thread and updates it as threads allocate and reclaim resources. Since κ threads execute at the same time, those changes must be coordinated in order to preserve the consistency of the resource budgets and guarantee *deterministic parallelism*. For this reason, the hierarchical runtime system is split into two components: (i) the *core scheduler*, which executes threads on a single core, ensures that they respect their resource budgets and performs security checks, and (ii) the top-level *parallel scheduler*, which synchronizes the execution on multiple cores and reassigns resources by updating the resource budgets according to the instructions of the core schedulers. We now introduce the core scheduler and describe the top-level parallel scheduler in Sect. 4.3.

<sup>6</sup> Even though the stack size is unbounded in this model, we could account for its memory usage by explicitly allocating it on the heap, in the style of Yang et al. [66].

<sup>7</sup> The labels form a security lattice (ℒ, ⊑, ⊔).

**Fig. 3.** Concurrent LIOPAR.

**Syntax.** Figure 3 presents the core scheduler, which has access to the global state Σ = (T, B, *H*, θ, ω), consisting of a thread pool map T, which maps thread ids to the corresponding thread's current state, a time budget map B, a memory budget map *H*, a core capabilities map θ, and the global clock ω. Using these maps, the core scheduler ensures that thread *n*: (i) performs B(*n*) uninterrupted steps until the next thread takes over, (ii) does not grow its heap above its maximum heap size *H*(*n*), and (iii) has exclusive access to the *free* core capabilities θ(*n*). Furthermore, each thread id *n* records the *initial* current label of the thread when it was created (*n*.*pc*), its clearance (*n*.*cl*), and the core where it runs (*n*.*k*), so that the runtime system can enforce security. Notice that thread ids are *opaque* to threads—they can neither forge them nor access their fields.

**Hierarchical Scheduling.** The core scheduler performs *deterministic* and *hierarchical* scheduling—threads lower in the hierarchy are scheduled first, i.e., parent threads are scheduled before their children. The scheduler manages a core run queue *Q*, which is structured as a binary tree with leaves storing thread ids and residual time budgets. The notation *n*<sup>*b*</sup> indicates that thread *n* can run for *b* more steps before the next thread runs. When a new thread is spawned, the scheduler creates a subtree with the parent thread on the left and the child on the right. The scheduler can therefore find the thread with the highest priority by following the left spine of the tree and backtracking to the right if a thread has no residual budget.<sup>8</sup> We write *Q*[*n*<sup>*b*</sup>] to mean that the first thread encountered via this traversal is *n* with budget *b*. As a result, given the slice *Q*[*n*<sup>1+*b*</sup>], thread *n* is the next thread to run, and *Q*[*n*<sup>0</sup>] occurs only if *all* threads in the queue have zero residual budget. We overload this notation to represent tree updates: a rule *Q*[*n*<sup>1+*b*</sup>] → *Q*[*n*<sup>*b*</sup>] finds the next thread to run in queue *Q* and decreases its budget by one.
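The queue discipline can be sketched as follows; the tree representation and the names `next_thread` and `tick` are ours, modeling the traversal and the update *Q*[*n*<sup>1+*b*</sup>] → *Q*[*n*<sup>*b*</sup>]:

```python
# A sketch (ours) of the core run queue: a binary tree whose leaves carry
# (thread id, residual budget).  A leaf is a (str, int) pair; an inner
# node is a (left, right) pair of subtrees.

def next_thread(q):
    """First leaf, left to right, with a positive budget (or None)."""
    if isinstance(q[0], str):                 # leaf: (id, budget)
        return q if q[1] > 0 else None
    return next_thread(q[0]) or next_thread(q[1])

def tick(q):
    """Q[n^{1+b}] -> Q[n^b]: charge one step to the next runnable thread."""
    if isinstance(q[0], str):                 # leaf
        return (q[0], q[1] - 1) if q[1] > 0 else q
    left, right = q
    if next_thread(left) is not None:         # follow the left spine first
        return (tick(left), right)
    return (left, tick(right))
```

When `next_thread` returns `None` for the whole tree, every leaf is exhausted (the *Q*[*n*<sup>0</sup>] case) and a context switch is due.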

**Semantics.** Figure 3 formally defines the transition *Q* −(*n*, s, *e*)→<sub>Σ</sub> *Q*′, which represents an execution step of the *core scheduler* that schedules thread *n* in core queue *Q*, executes it with global state Σ = (T, B, *H*, θ, ω) and updates the queue to *Q*′. Additionally, the core scheduler informs the parallel scheduler of the final state s of the thread and requests on its behalf to update the global state by means of the event message *e*. In rule [Step], the scheduler retrieves the next thread in the schedule, i.e., *Q*[*n*<sup>1+*b*</sup>], and its state in the thread pool from the global state, i.e., Σ.T(*n*) = s. Then, it executes the thread for one sequential step with its memory budget and clearance, i.e., s →<sub>μ</sub> s′ with μ = (Σ.*H*(*n*), *n*.*cl*), sends the empty event ε to the parallel scheduler, and decrements the thread's residual budget in the final queue, i.e., *Q*[*n*<sup>*b*</sup>]. In rule [Fork], thread *n* creates a new thread *t* with initial label L and clearance H, such that L ⊑ H and *pc* ⊑ L. The child thread runs on the same core as the parent thread, i.e., *n*.*k*, with fresh id *n*′, which is then added to the set of children, i.e., {*n*′} ∪ N. Since parent and child threads do not share memory, the core scheduler must copy the portion of the parent's private heap reachable by the child thread, i.e., Δ′; we do this by copying the bindings of the variables that are transitively reachable from *t*, i.e., fv<sup>∗</sup>(*t*, Δ), from the parent's heap Δ. The parent thread gives h2 of its memory budget Σ.*H*(*n*) to its child. The conditions |Δ| ≤ h1 and |Δ′| ≤ h2 ensure that the heaps do not overflow their new budgets. Similarly, the core scheduler splits the residual time budget of

<sup>8</sup> When implemented, this procedure might introduce a timing channel that leaks the number of threads running on the core. In practice, techniques from real time schedulers can be used to protect against such timing channels. The model of LIOPAR does not capture the execution time of the runtime system itself and thus this issue does not arise in the security proofs.

the parent into b1 and b2, informs the parallel scheduler about the new thread and its resources with the event **fork**(Δ′, *n*′, *t*, b2, h2), and lastly updates the tree *Q* by replacing the leaf *n*<sup>1+b1+b2</sup> with the two-leaf tree *n*<sup>b1</sup> | *n*′<sup>b2</sup>, so that the child thread will be scheduled immediately after the parent has consumed its remaining budget b1, as explained above. Rule [Spawn] is similar to [Fork], but consumes core capability resources instead of time and memory. In this case, the core scheduler checks that the parent thread owns the core where the child is scheduled and the core capabilities assigned to the child, i.e., θ(*n*) = {*k*′} ∪ K1 ∪ K2 for some set K2, and informs the parallel scheduler with the event **spawn**(Δ′, *n*′, *t*, K1). Rule [Stuck] performs busy waiting by consuming the time budget of the scheduled thread when it is *stuck* and cannot make any progress—the premises of the rule enumerate the conditions under which this can occur (see the extended version of this paper for details). Lastly, in rule [ContextSwitch] all the threads scheduled in the core queue have consumed their time budget, i.e., *Q*[*n*<sup>0</sup>], and the core scheduler resets their residual budgets using the budget map Σ.B. In the rule, the notation *Q*[*n*<sub>i</sub><sup>*b*</sup>] selects the i-th leaf, where i ∈ {1 . . |*Q*|} and |*Q*| denotes the number of leaves of tree *Q*; the symbol ◦ denotes the identifier of a dummy thread that simply spins during a context switch or whenever the core is unused.

#### **4.2 Resource Reclamation and Observations**

The calculus presented so far enables threads to manage their time, memory and core capabilities hierarchically, but does not provide any primitive to reclaim their resources. This section rectifies that by introducing (i) a primitive to kill a thread and return its resources to the owner and (ii) a primitive to elicit a garbage collection cycle and reclaim unused memory. Furthermore, we demonstrate that the runtime system presented in this paper is robust against timing attacks by exposing a timer API that gives threads access to a global clock.<sup>9</sup> Intuitively, it is secure to expose this feature because LIOPAR ensures that the time spent executing high threads is fixed in advance, so timing measurements of low threads remain unaffected. Lastly, since memory is hierarchically partitioned, each thread can securely query the current size of its *private heap*, enabling fine-grained control over the garbage collector.

**Kill.** A parent thread can reclaim the resources given to its child thread *n*′ by executing *kill n*′. If the child thread has itself forked or spawned other threads, they are transitively killed and their resources returned to the parent thread. The concurrent rule [Kill2] in Fig. 4 initiates this process, which is completed by the parallel scheduler via event **kill**(*n*′). Note that the rule applies only when the thread killed is a *direct* child of the parent thread—that is, when the parent's children set has the shape {*n*′} ∪ N for some set N. Now that threads can unrestrictedly reclaim resources by killing their children, we must revise the primitive

<sup>9</sup> An *external* attacker can take timing measurements using network communications. An attacker equipped with an *internal* clock is equally powerful but simpler to formalize [46].

$$
[\textsc{Kill}_2]\qquad
\dfrac{\Sigma.T(n) = (\Delta,\ pc,\ \{n'\} \cup N \mid n',\ kill : S) \qquad s = (\Delta,\ pc,\ N \mid return\ (),\ S)}
{Q[n^{1+b}]\ \xrightarrow{(n,\ s,\ \mathbf{kill}(n'))}_{\Sigma}\ Q[n^{b}]}
$$

$$
[\textsc{Unlabel}_2]\qquad
\dfrac{pc \sqcup \ell \sqsubseteq \mu.cl \qquad \forall n' \in N.\ pc \sqcup \ell \sqsubseteq n'.pc}
{(\Delta,\ pc,\ N \mid Labeled\ \ell\ t^{\circ},\ unlabel : S)\ \rightarrow_{\mu}\ (\Delta,\ pc \sqcup \ell,\ N \mid return\ t^{\circ},\ S)}
$$

$$
[\textsc{GC}]\qquad
\dfrac{\Delta' = \Delta|_{\,\mathrm{fv}^{*}(t,\,\Delta)\ \cup\ \mathrm{fv}^{*}(S,\,\Delta)}}
{(\Delta,\ pc,\ N \mid gc\ t,\ S)\ \rightarrow_{\mu}\ (\Delta',\ pc,\ N \mid t,\ S)}
$$

**Fig. 4.** LIOPAR with resource reclamation and observation primitives.

*unlabel*, since the naive combination of *kill* and *unlabel* can result in information leakage. This would happen if a public thread forked another public thread, then read a secret value (raising its label to secret), and based on that decided to kill the child. To close the leak, we modify rule [Unlabel2] by adding the highlighted premise, causing the primitive *unlabel* to fail whenever the parent thread's label would float above the *initial* current label of one of its children.
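The revised premise can be sketched as a simple check (our model, with hypothetical names; `children_pc` collects the initial current labels of the forked children):

```python
# A sketch (ours) of the revised [Unlabel2] premise: unlabel fails when
# the tainted program counter label would float above the clearance or
# above the initial current label of any forked child.
L, H = "L", "H"

def flows(a, b):
    """The lattice order: only the flow from H to L is disallowed."""
    return not (a == H and b == L)

def lub(a, b):
    """The join of two labels in the two-point lattice."""
    return H if H in (a, b) else L

def unlabel_ok(pc, cl, children_pc, l):
    new_pc = lub(pc, l)
    return flows(new_pc, cl) and all(flows(new_pc, c) for c in children_pc)
```

In particular, a public thread with a public child cannot read a secret value, which closes the kill-based leak described above.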

**Garbage Collection.** Rule [GC] extends LIOPAR with a *time-sensitive hierarchical* garbage collector via the primitive *gc t*. The rule elicits a garbage collection cycle, which drops entries that are no longer needed from the heap, and then evaluates *t*. The sub-heap Δ′ includes the portion of the current heap that is (transitively) *reachable* from the free variables in scope (i.e., those present in the term, fv<sup>∗</sup>(*t*, Δ), or on the stack, fv<sup>∗</sup>(*S*, Δ)). After collection, the thread resumes and evaluates term *t* under the compacted private heap Δ′.<sup>10</sup> In rule [App-GC], a collection is *automatically* triggered when the thread's next memory allocation would overflow the heap.

<sup>10</sup> In practice a garbage collection cycle takes time that is proportional to the size of the memory used by the thread. That does not hinder security as long as the garbage collector runs on the thread's time budget.
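The reachability-based collection of rule [GC] amounts to restricting the heap to the bindings transitively reachable from a set of roots. A minimal sketch (ours, with a simplified heap that maps each variable to the free variables of its thunk) follows:

```python
# A sketch (ours) of rule [GC]: restrict the heap to the bindings that
# are transitively reachable (fv* in the text) from a set of roots.

def reachable(heap, roots):
    """Variables transitively reachable from roots through the heap."""
    seen, todo = set(), list(roots)
    while todo:
        v = todo.pop()
        if v not in seen:
            seen.add(v)
            todo.extend(heap.get(v, ()))   # free variables of v's thunk
    return seen

def gc(heap, roots):
    """The compacted heap: only reachable bindings survive."""
    keep = reachable(heap, roots)
    return {v: fv for v, fv in heap.items() if v in keep}
```

Because each thread collects only its own private heap, the cycle runs on the thread's own time budget and cannot be observed by other threads.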

**Resource Observations.** All threads in the system share a global fine-grained clock ω, which is incremented by the parallel scheduler at each cycle (see below). Rule [Time] gives all threads unrestricted access to the clock via monadic primitive *time*.

### **4.3 Parallel Scheduler**

This section extends LIOPAR with *deterministic parallelism*, which allows κ threads to execute simultaneously on as many cores. To this end, we introduce the top-level parallel scheduler, which coordinates simultaneous changes to the global state by updating the resource budgets of the threads in response to core events (e.g., fork, spawn, and kill) and by ticking the global clock.

$$
\text{Core map } \Phi \in \{1\,..\,\kappa\} \to \mathit{Queue} \qquad\qquad \text{Configuration } c ::= \langle T,\, B,\, H,\, \theta,\, \omega,\, \Phi \rangle
$$

$$
[\textsc{Parallel}]\qquad
\dfrac{\begin{array}{c}
\forall i \in \{1\,..\,\kappa\}.\quad \Phi(i)\ \xrightarrow{(n_i,\ s_i,\ e_i)}_{\Sigma}\ Q_i \qquad
T' = \Sigma.T[n_i \mapsto s_i] \qquad \Phi' = \Phi[i \mapsto Q_i] \\[3pt]
c = \langle T',\, B,\, H,\, \theta,\, \Sigma.\omega + 1,\, \Phi' \rangle \qquad
c' = [\![\, \mathit{sort}\,[(n_1, e_1), \ldots, (n_\kappa, e_\kappa)]\,]\!]_{c}
\end{array}}
{\langle \Sigma, \Phi \rangle\ \rightarrow\ c'}
$$

$$
[\![\,(n, e) : \mathit{es}\,]\!]_{c} = [\![\,\mathit{es}\,]\!]_{\operatorname{next}(n,\, e,\, c)} \qquad\qquad [\![\,[\;]\,]\!]_{c} = c
$$

**Fig. 5.** Top-level parallel scheduler.

**Semantics.** Figure 5 formalizes the operational semantics of the parallel scheduler, which reduces a configuration *c* = ⟨Σ, Φ⟩, consisting of global state Σ and a core map Φ mapping each core to its run queue, to configuration *c*′ in one step, written *c* → *c*′, through rule [Parallel] only. The rule executes the threads scheduled on each of the κ cores, which all step at once according to the concurrent semantics presented in Sects. 4.1–4.2, with the same current global state Σ. Since the execution of each thread can change Σ *concurrently*, the top-level parallel scheduler reconciles those actions by updating Σ *sequentially* and *deterministically*.<sup>11</sup> First, the scheduler updates the thread pool map T and core map Φ with the final state obtained by running each thread in isolation, i.e., T′ = Σ.T[*n*i → si] and Φ′ = Φ[i → *Q*i] for i ∈ {1 .. κ}. Then, it collects all concurrent events generated by the κ threads together with their thread ids, sorts the events according to type, i.e., *sort* [(*n*1, *e*1), ..., (*n*κ, *e*κ)], and computes the updated configuration by processing the events in sequence.<sup>12</sup> In particular, new threads are created first (events **spawn**(·) and **fork**(·)) and then killed (event **kill**(·))—the ordering between events of the same type is arbitrary and assumed to be fixed. Trivial events (ε) do not affect the configuration and thus their ordering is irrelevant. The function ⟦*es*⟧<sub>*c*</sub> computes a final configuration by processing a list of events in order, accumulating configuration updates (next(·) advances the current configuration by one event step): ⟦(*n*, *e*) : *es*⟧<sub>*c*</sub> = ⟦*es*⟧<sub>next(*n*, *e*, *c*)</sub>. When no more events need processing, the configuration is returned: ⟦[ ]⟧<sub>*c*</sub> = *c*.
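The sort-then-fold discipline can be sketched in a few lines (our model; event payloads and the configuration are elided, and all names are hypothetical):

```python
# A sketch (ours) of the parallel scheduler's deterministic event handling:
# events are sorted by type (creations before kills, trivial events last),
# then folded over the configuration with next(.), mirroring
# [[(n,e):es]]_c = [[es]]_{next(n,e,c)} and [[[]]]_c = c.
SPAWN, FORK, KILL, EPS = "spawn", "fork", "kill", "eps"
RANK = {SPAWN: 0, FORK: 1, KILL: 2, EPS: 3}

def process_all(next_fn, conf, events):
    """events: list of (thread id, event); returns the final configuration."""
    # sorted() is stable, so events of the same type keep a fixed order.
    for n, e in sorted(events, key=lambda ne: RANK[ne[1]]):
        conf = next_fn(n, e, conf)
    return conf
```

Stability of the sort gives the fixed per-type ordering that the determinism argument relies on.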

**Event Processing.** Figure 5 defines the function next(*n*, *e*, *c*), which takes a thread identifier *n*, the event *e* that thread *n* generated, and the current configuration *c*, and outputs the configuration obtained by performing the thread's action. The empty event ε is trivial and leaves the state unchanged. Event (*n*1, **fork**(Δ, *n*2, *t*, *b*, h)) indicates that thread *n*1 forks thread *t* with identifier *n*2, sub-heap Δ, time budget *b* and maximum heap size h. The scheduler deducts these resources from the parent's budgets, i.e., B′ = B[*n*1 → B(*n*1) − *b*] and *H*′ = *H*[*n*1 → *H*(*n*1) − h], and assigns them to the child, i.e., B′[*n*2 → *b*] and *H*′[*n*2 → h].<sup>13</sup> The new child shares the core with the parent—it has no core capabilities, i.e., θ′ = θ[*n*2 → ∅]—and so the core map is left unchanged. Lastly, the scheduler adds the child to the thread pool and initializes its state, i.e., T[*n*2 → (Δ, *n*2.*pc*, ∅ | *t*, [ ])]. The scheduler handles event (*n*1, **spawn**(Δ, *n*2, *t*, *K*)) similarly. The new thread *t* gets scheduled on core *n*2.*k*, i.e., Φ[*n*2.*k* → *n*2<sup>B0</sup>], where the thread takes all the time and memory resources of the core, i.e., B[*n*2 → B0] and *H*[*n*2 → *H*0], and the extra core capabilities *K*, i.e., θ′[*n*2 → *K*]. For simplicity, we assume that all cores execute B0 steps per cycle and feature a memory of size *H*0. Event (*n*, **kill**(*n*′)) informs the scheduler that thread *n* wishes to kill thread *n*′. The scheduler leaves the global state unchanged if the parent thread has already been killed by the time this event is handled, i.e., when the guard *n* ∉ *Dom*(T) is true—the resources of the child *n*′ will have been reclaimed by another ancestor.

<sup>11</sup> Non-deterministic updates would make the model vulnerable to refinement attacks [40].

<sup>12</sup> Since the clock only needs to be incremented, we could have left it out of the configuration *c* = ⟨T′, B, *H*, θ, Σ.ω + 1, Φ′⟩; the function ⟦*es*⟧<sub>*c*</sub> neither uses nor changes its value.

<sup>13</sup> Notice that <sup>|</sup>Δ<sup>|</sup> < h by rule [Fork].

Otherwise, the scheduler collects the identifiers of the descendants of *n*′ that are *alive* (N = ⟦{*n*′}⟧<sub>T</sub>)—they must be killed (and their resources reclaimed) *transitively*. The set N is computed recursively by ⟦N⟧<sub>T</sub>, using the thread pool T, i.e., ⟦∅⟧<sub>T</sub> = ∅, ⟦{*n*′}⟧<sub>T</sub> = {*n*′} ∪ ⟦T(*n*′).N⟧<sub>T</sub> and ⟦N1 ∪ N2⟧<sub>T</sub> = ⟦N1⟧<sub>T</sub> ∪ ⟦N2⟧<sub>T</sub>. The scheduler then increases the time and memory budgets of the parent with the sum of the budgets of all its descendants scheduled on the *same* core, i.e., Σ<sub>i ∈ N, i.*k* = *n*.*k*</sub> B(i) (resp. Σ<sub>i ∈ N, i.*k* = *n*.*k*</sub> *H*(i))—descendants running on other cores do not share those resources. The scheduler reassigns to the parent thread their core capabilities, which are split between capabilities explicitly assigned but not in use, i.e., ⋃<sub>i ∈ N</sub> θ(i), and core capabilities assigned to and in use by running threads, i.e., {i.*k* | i ∈ N, i.*k* ≠ *n*.*k*}. Lastly, the scheduler removes the killed threads from each core, written Φ(i) \ N, by pruning the leaves containing killed threads and reassigning their leftover time budget to their parent; see the extended version of this paper for details.
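The transitive collection and the same-core budget reclamation can be sketched as follows (our model; the pool, core and budget maps are simplified association structures with hypothetical names):

```python
# A sketch (ours) of transitive reclamation on kill: collect the live
# descendants of a thread from the thread pool, then sum the budgets of
# those scheduled on the parent's core.

def descendants(pool, n):
    """n together with all of its transitive children (pool: id -> children)."""
    out = [n]
    for child in pool.get(n, ()):
        out.extend(descendants(pool, child))
    return out

def reclaimed(pool, cores, budgets, parent_core, n):
    """Budget returned to the parent: same-core descendants only."""
    return sum(budgets[i] for i in descendants(pool, n)
               if cores[i] == parent_core)
```

Descendants spawned on other cores contribute nothing here; their cores return to the parent as capabilities instead.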

### **5 Security Guarantees**

In this section we show that LIOPAR satisfies a strong security condition that ensures timing-agreement of threads and rules out timing covert channels. In Sect. 5.1, we describe our proof technique based on *term erasure*, which has been used to verify security guarantees of functional programming languages [30], IFC libraries [8,17,54,56,61], and an IFC runtime system [59]. In Sect. 5.2, we formally prove security, i.e., *progress- and timing-sensitive non-interference*, a strong form of non-interference [14] inspired by Volpano and Smith [64]; to our knowledge, it is considered here for the first time in the context of parallel runtime systems. Works that do not address external timing channels [59,62] normally prove *progress-sensitive* non-interference, wherein the number of execution steps of a program may differ in two runs based on a secret. This condition is insufficient in the parallel setting: both public and secret threads may step simultaneously on different cores and any difference in the number of execution steps would introduce *external* and *internal* timing attacks. Similar to previous works on secure multi-threaded systems [36,52], we establish a *strong* low-bisimulation property of the parallel scheduler, which guarantees that attacker-indistinguishable configurations execute in lock-step and remain indistinguishable. Theorem 1 and Corollary 1 use this property to ensure that any two related parallel programs execute in exactly the same number of steps.

#### **5.1 Erasure Function**

The term erasure technique relies on an *erasure function*, written ε*L*(·), which rewrites secret data above the attacker's level *L* to special term •, in all the syntactic categories: values, terms, heaps, stacks, global states and configurations.<sup>14</sup> Once the erasure function is defined, the core of the proof technique

<sup>14</sup> For ease of exposition, we use the two-point lattice {*L*, *H*}, where the flow from *H* to *L* is the only disallowed flow. Neither our proofs nor our model rely on this particular lattice.

consists of proving an essential *commutativity* relationship between the erasure function and reduction steps: given a step *c* → *c*′, there must exist a reduction that *simulates* the original reduction between the erased configurations, i.e., ε*L*(*c*) → ε*L*(*c*′). Intuitively, if the configuration *c* leaked secret data while stepping to *c*′, that data would be classified as public in *c*′ and thus would remain in ε*L*(*c*′)—but such secret data would be erased by ε*L*(*c*) and the property would not hold. The erasure function leaves ground values, e.g., (), unchanged, and on most terms it acts homomorphically, e.g., ε*L*(*t*1 *t*2) = ε*L*(*t*1) ε*L*(*t*2). The interesting cases are labeled values, thread configurations, and resource maps. The erasure function removes the content of secret labeled values, i.e., ε*L*(*Labeled H t*◦) = *Labeled H* •, and erases the content recursively otherwise, i.e., ε*L*(*Labeled L t*◦) = *Labeled L* ε*L*(*t*)◦. The state of a thread is erased per component, homomorphically if the program counter label is public, i.e., ε*L*(Δ, *L*, N | *t*, *S*) = (ε*L*(Δ), *L*, N | ε*L*(*t*), ε*L*(*S*)), and in full otherwise, i.e., ε*L*(Δ, *H*, N | *t*, *S*) = (•, •, • | •, •).

**Resource Erasure.** Since LIOPAR manages resources explicitly, the simulation property above requires defining the erasure function for resources as well. The erasure function should *preserve* information about the resources (e.g., time, memory, and core capabilities) of *public threads*, since the attacker can explicitly assign resources (e.g., with *fork* and *spawn*) and measure them (e.g., with *size*). But what about the resources of secret threads? One might think that such information is secret and thus should be erased—intuitively, a thread might decide to assign, say, half of its time budget to its secret child depending on secret information. However, public threads can also assign (public) resources to a secret thread when forking: even though these resources currently belong to the secret child, they are *temporary*—the public parent might reclaim them later. Thus, we cannot associate the sensitivity of the resources of a thread with its program counter label when resources are managed *hierarchically*, as in LIOPAR. Instead, we associate the security level of the resources of a secret thread with the sensitivity of its parent: the resources of a secret thread are *public* information whenever the program counter label of the parent is public and *secret* information otherwise. Furthermore, since resource reclamation is transitive, the erasure function cannot discard secret resources, but must rather redistribute them to the hierarchically closest set of public resources, as when *killing* them.

**Time Budget.** First, we project the identifiers of *public* threads from the thread pool T: *Dom*<sub>L</sub>(T) = {*n*<sup>L</sup> | *n* ∈ *Dom*(T) ∧ T(*n*).*pc* ≡ *L*}, where the notation *n*<sup>L</sup> indicates that the program counter label of thread *n* is public. Then, the set *P* = ⋃<sub>*n* ∈ *Dom*<sub>L</sub>(T)</sub> ({*n*} ∪ T(*n*).N) contains the identifiers of all the public threads and their immediate children.<sup>15</sup> The resources of threads *n* ∈ *P* are public information. However, the program counter label of a thread *n* ∈ *P* is not necessarily public, as explained previously. Hence *P* can be disjointly partitioned

<sup>15</sup> The id of the spinning thread on each free core is also public, i.e., ◦<sub>k</sub> ∈ *P* for *k* ∈ {1..κ}.

by program counter label: *P* = *P*<sup>L</sup> ∪ *P*<sup>H</sup>, where *P*<sup>L</sup> = {*n*<sup>L</sup> | *n* ∈ *P*} and *P*<sup>H</sup> = {*n*<sup>H</sup> | *n* ∈ *P*}. Erasure of the budget map then proceeds on this partition, leaving the budgets of the public threads untouched and adding to the budget of each secret thread the budgets of its descendants, which are instead omitted. In symbols, ε<sub>L</sub>(B) = B<sup>L</sup> ∪ B<sup>H</sup>, where B<sup>L</sup> = {*n*<sup>L</sup> ↦ B(*n*<sup>L</sup>) | *n*<sup>L</sup> ∈ *P*<sup>L</sup>} and B<sup>H</sup> = {*n*<sup>H</sup> ↦ B(*n*<sup>H</sup>) + Σ<sub>i</sub> B(i) | *n*<sup>H</sup> ∈ *P*<sup>H</sup>}, where i ranges over the descendants of *n*<sup>H</sup> in T.
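The budget-map erasure can be sketched concretely. This is a sketch under stated assumptions: `children` maps each thread id to its direct children, `pc` gives each thread's program-counter label, budgets are integers, and all names are hypothetical rather than the paper's notation.

```python
def descendants(children, n):
    """All (strict) descendants of thread n in the hierarchy."""
    out = []
    for c in children.get(n, []):
        out.append(c)
        out.extend(descendants(children, c))
    return out

def erase_budgets(budget, children, pc):
    """Erase a budget map: threads in P (public threads and their immediate
    children) survive; a surviving secret thread absorbs the budgets of all
    its erased descendants."""
    public = {n for n in budget if pc[n] == "L"}
    surviving = public | {c for n in public for c in children.get(n, [])}
    erased = {}
    for n in surviving:
        if pc[n] == "L":
            erased[n] = budget[n]        # public budgets kept untouched
        else:                            # secret thread in P: sum descendants
            erased[n] = budget[n] + sum(budget[d] for d in descendants(children, n))
    return erased
```

For a public root with one public and one secret child, the secret child survives erasure carrying the combined budget of its whole (erased) subtree.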

**Queue Erasure.** The erasure of core queues follows the same intuition, preserving threads *n* ∈ *P* and trimming all other secret threads *n*<sup>H</sup> ∉ *P*. Since queues annotate thread ids with their residual time budgets, the erasure function must reassign the budget of each trimmed thread *n*<sup>H</sup> ∉ *P* to its closest ancestor *n* ∈ *P* on the same core. This ancestor could be either (i) another *secret* thread *n*<sup>H</sup> ∈ *P* on the same core, or (ii) the spinning thread ◦ ∈ *P* of that core, if no thread *n* ∈ *P* runs on it—the difference between these two cases lies in whether the original thread was *forked* or *spawned* on that core. More formally, if the queue contains no thread *n* ∈ *P*, then the erasure function replaces the queue altogether with the spinning thread and returns the residual budgets of the trimmed threads to it, i.e., ε<sub>L</sub>(*Q*) = ⟨◦<sup>B</sup>⟩ where B = Σ<sub>i</sub> *b*<sub>i</sub>, whenever every leaf *Q*[*n*<sub>i</sub><sup>*b*<sub>i</sub></sup>], for i ∈ {1..|*Q*|}, satisfies *n*<sub>i</sub> ∉ *P*. Otherwise, the queue contains at least one thread *n* ∈ *P*, and the erasure function returns the residual time budgets of trimmed threads to their ancestors, i.e., ε<sub>L</sub>(*Q*) = *Q* ↓<sub>L</sub>, by combining the effects of the following mutually recursive functions:

$$\begin{array}{c} \langle n^b \rangle \downarrow\_L = \langle n^b \rangle \\ \langle Q\_1, Q\_2 \rangle \downarrow\_L = (Q\_1 \downarrow\_L) \vee (Q\_2 \downarrow\_L) \end{array} \qquad \begin{array}{c} \langle n\_{1H}^{b\_1} \rangle \vee \langle n\_{2H}^{b\_2} \rangle = \langle n\_{1H}^{b\_1 + b\_2} \rangle \\ Q\_1 \vee Q\_2 = \langle Q\_1, Q\_2 \rangle \end{array}$$

The interesting case is ⟨n<sub>1H</sub><sup>b<sub>1</sub></sup>⟩ ∨ ⟨n<sub>2H</sub><sup>b<sub>2</sub></sup>⟩, which reassigns the budget of the child (the right leaf ⟨n<sub>2H</sub><sup>b<sub>2</sub></sup>⟩) to the parent (the left leaf ⟨n<sub>1H</sub><sup>b<sub>1</sub></sup>⟩) by rewriting the subtree into the single leaf ⟨n<sub>1H</sub><sup>b<sub>1</sub>+b<sub>2</sub></sup>⟩.
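The two mutually recursive functions can be sketched on a binary-tree representation of queues. The `("leaf", id, budget)`/`("node", left, right)` encoding and the `keep` predicate (standing for membership in *P*) are assumptions of this sketch, not the paper's formalization.

```python
def erase_queue(q, keep):
    """The ↓L function: leaves are preserved; inner nodes erase both
    subtrees and then try to merge them with ∨."""
    if q[0] == "leaf":
        return q                          # ⟨n^b⟩ ↓L = ⟨n^b⟩
    _, q1, q2 = q
    return merge(erase_queue(q1, keep), erase_queue(q2, keep), keep)

def merge(q1, q2, keep):
    """The ∨ function: a trimmed right leaf hands its residual budget
    back to the leaf on its left; otherwise the tree is kept as-is."""
    if q1[0] == "leaf" and q2[0] == "leaf" and not keep(q2[1]):
        return ("leaf", q1[1], q1[2] + q2[2])   # budget flows to the parent
    return ("node", q1, q2)                     # Q1 ∨ Q2 = ⟨Q1, Q2⟩
```

Merging cascades bottom-up: a chain of trimmed descendants collapses into its surviving ancestor, which ends up holding the sum of all residual budgets (the all-trimmed case, handled separately in the text by the spinning thread, is omitted here).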

#### **5.2 Timing-Sensitive Non-interference**

The proof of progress- and timing-sensitive non-interference relies on two fundamental properties of parallel reductions: *determinacy* and *simulation*. Determinacy requires that the reduction relation be deterministic.

### **Proposition 1 (Determinism).** *If c*<sub>1</sub> → *c*<sub>2</sub> *and c*<sub>1</sub> → *c*<sub>3</sub>*, then c*<sub>2</sub> ≡ *c*<sub>3</sub>*.*

The equivalence in the statement denotes alpha-equivalence, i.e., equality up to the choice of variable names. We now show that the parallel scheduler preserves *L*-equivalence of parallel configurations.

**Definition 1 (***L***-equivalence).** *Two configurations c*<sub>1</sub> *and c*<sub>2</sub> *are indistinguishable to an attacker at security level L, written c*<sub>1</sub> ≈<sub>L</sub> *c*<sub>2</sub>*, if and only if* ε<sub>L</sub>(*c*<sub>1</sub>) ≡ ε<sub>L</sub>(*c*<sub>2</sub>)*.*

**Proposition 2 (Parallel simulation).** *If c* → *c*′*, then* ε<sub>L</sub>(*c*) → ε<sub>L</sub>(*c*′)*.*

By combining *determinism* (Proposition 1) and *parallel simulation* (Proposition 2), we prove *progress-insensitive non-interference*, which assumes progress of both configurations.

**Proposition 3 (Progress-insensitive non-interference).** *If c*<sub>1</sub> → *c*′<sub>1</sub>*, c*<sub>2</sub> → *c*′<sub>2</sub>*, and c*<sub>1</sub> ≈<sub>L</sub> *c*<sub>2</sub>*, then c*′<sub>1</sub> ≈<sub>L</sub> *c*′<sub>2</sub>*.*

In order to lift this result to a progress-sensitive one, we first prove *timing-sensitive progress*. Intuitively, if a *valid* configuration steps, then any low-equivalent parallel configuration also steps.<sup>16</sup>

**Proposition 4 (Timing-sensitive progress).** *Given a valid configuration c*<sub>1</sub>*, if c*<sub>1</sub> → *c*′<sub>1</sub> *and c*<sub>1</sub> ≈<sub>L</sub> *c*<sub>2</sub>*, then there exists c*′<sub>2</sub> *such that c*<sub>2</sub> → *c*′<sub>2</sub>*.*

By combining progress-insensitive non-interference (Proposition 3) with timing-sensitive progress (Proposition 4), we obtain a *strong L-bisimulation* property between configurations and prove *progress- and timing-sensitive non-interference*.

**Theorem 1 (Progress- and timing-sensitive non-interference).** *For all valid configurations c*<sub>1</sub> *and c*<sub>2</sub>*, if c*<sub>1</sub> → *c*′<sub>1</sub> *and c*<sub>1</sub> ≈<sub>L</sub> *c*<sub>2</sub>*, then there exists a configuration c*′<sub>2</sub> *such that c*<sub>2</sub> → *c*′<sub>2</sub> *and c*′<sub>1</sub> ≈<sub>L</sub> *c*′<sub>2</sub>*.*

The following corollary instantiates the non-interference security theorem above for a given LIOPAR parallel program, explicitly ruling out leaks via timing channels. In the following, the notation →<sup>u</sup> denotes *u* reduction steps of the parallel scheduler.

**Corollary 1.** *Given a well-typed LIO*<sub>PAR</sub> *program t of type Labeled* τ<sub>1</sub> → *LIO* τ<sub>2</sub> *and two closed secrets t*◦<sub>1</sub>, *t*◦<sub>2</sub> :: τ<sub>1</sub>*, let* s<sub>i</sub> = ([ ], *L*, ∅ | *t* (*Labeled H t*◦<sub>i</sub>), [ ]) *and* c<sub>i</sub> = (T<sub>i</sub>, B, H, θ, 0, Φ<sub>i</sub>)*, where* T<sub>i</sub> = [*n*<sup>L</sup> ↦ s<sub>i</sub>, ◦<sub>j</sub> ↦ s<sub>◦</sub>]*,* B = [*n*<sup>L</sup> ↦ B<sub>0</sub>, ◦<sub>j</sub> ↦ 0]*,* H = [*n*<sup>L</sup> ↦ H<sub>0</sub>, ◦<sub>j</sub> ↦ H<sub>0</sub>′]*,* θ = [*n*<sup>L</sup> ↦ {2..κ}, ◦<sub>j</sub> ↦ ∅]*,* Φ<sub>i</sub> = [1 ↦ s<sub>i</sub>, 2 ↦ ◦<sub>2</sub>, ..., κ ↦ ◦<sub>κ</sub>]*, for* i ∈ {1, 2}*, j* ∈ {1..κ}*, and thread identifier n*<sup>L</sup> *such that n*.*k* = 1 *and n*.*cl* = *H. If c*<sub>1</sub> →<sup>u</sup> *c*′<sub>1</sub>*, then there exists a configuration c*′<sub>2</sub> *such that c*<sub>2</sub> →<sup>u</sup> *c*′<sub>2</sub> *and c*′<sub>1</sub> ≈<sub>L</sub> *c*′<sub>2</sub>*.*

To conclude, we show that the *timing-sensitive* security guarantees of LIOPAR extend to concurrent *single-core* programs by instantiating Corollary 1 with κ = 1.

### **6 Limitations**

**Implementation.** Implementing LIOPAR is a serious undertaking that requires a major redesign of GHC's runtime system. Conventional runtime systems freely

<sup>16</sup> A configuration is valid if it satisfies several basic properties, e.g., it does not contain the special term •. See the extended version of this paper for details.

share resources among threads to boost performance and guarantee fairness. For instance, in GHC, threads share heap objects to save memory space and execution time (when evaluating expressions). In contrast, LIOPAR strictly partitions resources to enforce security—threads at different security labels cannot share heap objects. As a result, the GHC memory allocator must be adapted to isolate each thread's private heap, so that allocation and collection can occur independently and in parallel. Similarly, the GHC "fair" round-robin scheduler must be heavily modified to track and manage threads' time budgets, and to preemptively perform a context switch when a thread's time slice is up.

**Programming Model.** Since resource management is explicit, building applications atop LIOPAR introduces new challenges—the programmer must explicitly choose resource bounds for each thread. If done poorly, threads can spend excessive amounts of time sitting idle when given too much CPU time, or garbage collecting when not given enough heap space. The problem of tuning resource allocation parameters is not unique to LIOPAR—Yang and Mazières [66] propose to use GHC profiling mechanisms to determine heap size, while the real-time garbage collector by Henriksson [16] requires the programmer to specify the worst-case execution time, period, and worst-case allocation of each high-priority thread. Das and Hoffmann [9] demonstrate a more automatic approach—they apply machine learning techniques to statically determine upper bounds on execution time and heap usage of OCaml programs. Similar techniques could be applied to LIOPAR in order to determine the most efficient resource partitions. Moreover, this challenge is not unique to real-time systems or LIOPAR; choosing privacy parameters in differential privacy, for example, shares many similarities [21,29].

The LIOPAR programming model is also likely easier to use in certain application domains—e.g., web applications where the tail latency of a route can inform the thread bounds, or embedded systems where similar latency requirements are the norm. Nevertheless, in order to simplify programming with LIOPAR, we intend to introduce privileges (and thus declassification) similar to LIO [12,56] or COWL [57].

Coarse-grained, floating-label systems such as LIO and LIOPAR can suffer *label creep*, wherein the current computation gets tainted to a point where it cannot perform any useful writes [55]. Sequential LIO [56] addresses label creep through a primitive, toLabeled, which executes a computation (that may raise the current label) in a separate context and restores the current label upon its termination. Similar to concurrent LIO [54], LIOPAR relies on fork, and not toLabeled, to address label creep—the latter exposes the termination covert channel [54]. Even though LIOPAR has a more restricted floating-label semantics than concurrent LIO, LIOPAR also supports parallel execution, garbage collection, and new APIs for getting heap statistics, counting elapsed time, and killing threads.

### **7 Related Work**

There is substantial work on language-level IFC systems [10,15,20,34,43,50,51, 54,55,67,68]. Our work builds on these efforts in several ways. Firstly, LIOPAR extends the concurrent LIO IFC system [54] with parallelism—to our knowledge, this is the first *dynamic* IFC system to support parallelism and address the internalization of external timing channels. Previous static IFC systems implicitly allow for parallelism, e.g., Muller and Chong's [41], several works on IFC π-calculi [18,19,25], and Rafnsson et al.'s [49] recent foundations for composable timing-sensitive interactive systems. These efforts, however, do not model runtime system resource management. Volpano and Smith [64] enforce a timing agreement condition, similar to ours, but for a static concurrent IFC system. Mantel et al. [37] and Li et al. [31] prove non-interference for static, concurrent systems, using rely-guarantee reasoning.

Unlike most of these previous efforts, our hierarchical runtime system also eliminates classes of resource-based external timing channels, such as memory exhaustion and garbage collection. Pedersen and Askarov [46], however, were the first to identify automatic memory management as a source of covert channels for IFC systems and to demonstrate the feasibility of attacks against both V8 and the JVM. They propose a sequential static IFC language with label-partitioned memory and a label-aware timing-sensitive garbage collector, which is vulnerable to *external timing* attacks and satisfies only *termination-insensitive* non-interference.

Previous work on language-based systems—namely [35,66]—identifies memory retention and memory exhaustion as sources of denial-of-service (DOS) attacks. Memory retention and exhaustion can also be used as covert channels. In addressing those covert channels, LIOPAR also addresses the DOS attacks outlined by these efforts. Indeed, our work generalizes Yang and Mazières' [66] region-based allocation framework with region-based garbage collection and hierarchical scheduling.

Our LIOPAR design also borrows ideas from the secure operating system community. Our explicit hierarchical memory management is conceptually similar to HiStar's container abstraction [69]. In HiStar, containers—subject to quotas, i.e., space limits—are used to hierarchically allocate and deallocate objects. LIOPAR adopts this idea at the language level and automates the allocation and reclamation. Moreover, we hierarchically partition CPU time; Zeldovich et al. [69], however, observed that their container abstraction can be repurposed to enforce CPU quotas. Deterland [65] splits time into ticks to address internal timing channels and mitigate external timing ones. Deterland builds on Determinator [4], an OS that executes parallel applications deterministically and efficiently. LIOPAR brings many ideas from these systems—both deterministic parallelism and ticks (semantic steps)—to the language level. Deterministic parallelism at the language level has also been explored prior to this work [27,28,38], but, unlike these efforts, LIOPAR also hierarchically manages resources to eliminate classes of external timing channels.

Fabric [33,34] and DStar [70] are distributed IFC systems. Though we believe that our techniques would scale beyond multi-core systems (e.g., to data centers), LIOPAR will likely not easily scale to large distributed systems like Fabric and DStar. Different from Fabric and DStar, however, LIOPAR addresses both internal and external timing channels that result from running code in parallel.

Our hierarchical resource management approach is not unique—other countermeasures to external timing channels have been studied. Hu [22], for example, mitigates both timing channels in the VAX/VMM system [32] using "fuzzy time"—an idea recently adopted by browsers [26]. Askarov et al. [2] mitigate external timing channels using predictive black-box mitigation, which delays events and thus bounds information leakage. Rather than using noise as in the fuzzy time technique, however, they predict the schedule of future events. Some of these approaches have also been adopted at the language level [46,54,71]. We find these techniques largely orthogonal: they can be used alongside our techniques to mitigate the timing channels we do not eliminate.

Real-time systems—when developed with garbage-collected languages [3,5,6, 16]—face challenges similar to this work. Blelloch and Cheng [6] describe a real-time garbage collector (RTGC) for multi-core programs with *provable* resource bounds—LIOPAR *enforces* resource bounds instead. A more recent RTGC created by Auerbach et al. [3] describes a technique to "tax" threads into contributing to garbage collection as they utilize more resources. Henriksson [16] describes an RTGC capable of enforcing hard and soft deadlines, once given upper bounds on the space and time resources used by threads. Most similar to LIOPAR, Pizlo et al. [48] implement a hierarchical RTGC algorithm that independently collects partitioned heaps.

### **8 Conclusion**

Language-based IFC systems built atop off-the-shelf runtime systems are vulnerable to resource-based external-timing attacks. When these systems are extended with thread parallelism, these attacks become yet more vicious—they can be carried out internally. We presented LIOPAR, the design of the first dynamic IFC hierarchical runtime system that supports deterministic parallelism and eliminates both resource-based internal- and external-timing covert channels. To our knowledge, LIOPAR is the first parallel system to satisfy progress- and timing-sensitive non-interference.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **A Formal Analysis of Timing Channel Security via Bucketing**

Tachio Terauchi1(B) and Timos Antonopoulos<sup>2</sup>

<sup>1</sup> Waseda University, Tokyo, Japan terauchi@waseda.jp <sup>2</sup> Yale University, New Haven, USA timos.antonopoulos@yale.edu

**Abstract.** This paper investigates the effect of *bucketing* in security against timing channel attacks. Bucketing is a technique proposed to mitigate timing-channel attacks by restricting a system's outputs to only occur at designated time intervals, and has the effect of reducing the possible timing-channel observations to a small number of possibilities. However, there is little formal analysis on when and to what degree bucketing is effective against timing-channel attacks. In this paper, we show that bucketing is in general insufficient to ensure security. Then, we present two conditions that can be used to ensure security of systems against adaptive timing channel attacks. The first is a general condition that ensures that the security of a system decreases only by a limited degree by allowing timing-channel observations, whereas the second condition ensures that the system would satisfy the first condition when bucketing is applied and hence becomes secure against timing-channel attacks. A main benefit of the conditions is that they allow *separation of concerns* whereby the security of the regular channel can be proven independently of concerns of side-channel information leakage, and certain conditions are placed on the side channel to guarantee the security of the whole system. Further, we show that the bucketing technique can be applied compositionally in conjunction with the constant-time-implementation technique to increase their applicability. While we instantiate our contributions to timing channel and bucketing, many of the results are actually quite general and are applicable to any side channels and techniques that reduce the number of possible observations on the channel.

### **1 Introduction**

*Side-channel attacks* aim to recover a computer system's secret information by observing the target system's side channels such as cache, power, timing and electromagnetic radiation [11,15–17,21,23–25,31,36]. They are well recognized as a serious threat to the security of computer systems. *Timing-channel* (or simply *timing*) *attacks* are a class of side-channel attacks in which the adversary makes observations on the system's running time. Much research has been done to detect and prevent timing attacks [1,3,4,6,7,9,18,20,22,26,27,30,41].

*Bucketing* is a technique proposed for mitigating timing attacks [7,14,26,27,41]. It restricts the system's outputs to only occur at designated time intervals, and therefore has the effect of reducing the possible timing-channel observations to a small number of possibilities. This comes at some cost to the system's performance, because outputs must be delayed to the next bucket time. Nonetheless, in comparison to the *constant-time implementation* technique [1,3,6,9,20,22], which restricts the system's running time to be independent of secrets, bucketing is often said to be more efficient and easier to implement, as it allows running times to vary depending on secrets [26,27].<sup>1</sup> For example, bucketing may be implemented in a black-box style by a monitor that buffers and delays outputs [7,41].
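Conceptually, such a monitor rounds each output time up to the next bucket boundary. A minimal sketch, where the `deadline`/`k` parameterization is an assumption and not a prescribed API:

```python
import math

def bucketize(t, deadline, k):
    """Delay an output produced at time t to the next of k bucket
    boundaries, spaced deadline / k apart."""
    width = deadline / k
    return width * math.ceil(t / width)
```

With k buckets, only k + 1 distinct completion times (including time 0) remain observable, regardless of how finely the raw running time varied with the secret.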

In this paper, we formally study the effect of bucketing on security against *adaptive* timing attacks. To this end, we first give a formal notion of security against adaptive side-channel-observing adversaries, called (f, ε)*-security*. Roughly, (f, ε)-security says that the probability that an adversary can recover the secret by making at most f(n) queries to the system is bounded by ε(n), where n is the security parameter.

Next, we show that bucketing alone is in general insufficient to guarantee security against adaptive side-channel attacks by presenting a counterexample that has only two timing observations and yet is efficiently attackable. This motivates a search for conditions sufficient for security. We present a condition, called *secret-restricted side-channel refinement* (SRSCR), which roughly says that a system is secure if there are sufficiently large subsets of secrets such that (1) the system's side channel reveals no more information than the regular channel on the subsets and (2) the system is secure on the subsets against adversaries who only observe the regular channel. The degree of security (i.e., f and ε) is proportional to that against regular-channel-only-observing adversaries and the size of the subsets.

Because of the insufficiency of bucketing mentioned above, applying bucketing to an arbitrary system may not lead to a system that satisfies SRSCR (for good f and ε). To this end, we present a condition, called *low-input side-channel non-interference* (LISCNI). We show that applying bucketing to a system that satisfies the condition results in a system that satisfies SRSCR. Therefore, LISCNI is a sufficient condition for security under the bucketing technique. Roughly, LISCNI says that (1) the side-channel observation does not depend on attacker-controlled inputs (but may depend on secrets) and (2) the system is secure against adversaries who only observe the regular channel. The degree of security is proportional to that against regular-channel-only-observing adversaries and the granularity of buckets. A main benefit of the conditions SRSCR and LISCNI is that they allow *separation of concerns* whereby the security of the regular channel can be proven independently of concerns of side-channel

<sup>1</sup> Sometimes, the terminology "constant-time implementation" is used to mean even stricter requirements, such as requiring control flows to be secret independent [3,9]. In this paper, we use the terminology for a more permissive notion in which only the running time is required to be secret independent.

information leakage, and certain conditions are placed on the side channel to guarantee the security of the whole system.

Finally, we show that the bucketing technique can be applied in a compositional manner with the constant-time implementation technique. Specifically, we show that when a system is a sequential composition of components in which one component is constant-time and the other satisfies LISCNI, the whole system can be made secure by applying bucketing only to the non-constant-time part. We show that the combined approach is able to ensure security of some non-constant-time systems that cannot be made secure by applying bucketing to the whole system. We summarize the main contributions below.


While the paper focuses on timing channels and bucketing, many of the results are actually quite general and are applicable to side channels other than timing channels. Specifically, aside from the compositional bucketing result that exploits the "additive" nature of timing channels (cf. Sect. 3.3), the results are applicable to any side channels and techniques that reduce the number of possible side-channel observations.

The rest of the paper is organized as follows. Section 2 formalizes the setting and defines (f, ε)-security, a formal notion of security against adaptive side-channel attacks. We also show that bucketing is in general insufficient to guarantee security of systems against adaptive side-channel attacks. Section 3 presents sufficient conditions for ensuring (f, ε)-security: SRSCR and LISCNI. We show that they facilitate proving the security of systems by allowing system designers to prove the security of regular channels separately from the concern of side channels. We also show that the LISCNI condition may be used in combination with the constant-time implementation technique in a compositional manner so as to prove the security of systems that are neither constant-time nor can be made secure by (globally) applying bucketing. Section 4 discusses related work. Section 5 concludes the paper with a discussion on future work.

#### **2 Security Against Adaptive Side-Channel Attacks**

Formally, a *system* (or *program*) is a tuple (rc, sc, S, I, O<sup>rc</sup>, O<sup>sc</sup>) where rc and sc are indexed families of functions (indexed by the security parameter) that represent the regular-channel and side-channel input-output relations of the system, respectively. S is a security-parameter-indexed family of sets of *secrets* (or *high inputs*) and I is a security-parameter-indexed family of sets of *attacker-controlled inputs* (or *low inputs*). A *security parameter* is a natural number that represents the size of secrets, and we write S<sub>n</sub> for the set of secrets of size n and I<sub>n</sub> for the set of corresponding attacker-controlled inputs. Each indexed function rc<sub>n</sub> (respectively sc<sub>n</sub>) is a function from S<sub>n</sub> × I<sub>n</sub> to O<sup>rc</sup><sub>n</sub> (resp. O<sup>sc</sup><sub>n</sub>), where O<sup>rc</sup> and O<sup>sc</sup> are indexed families of sets of possible regular-channel and side-channel outputs, respectively. For (s, v) ∈ S<sub>n</sub> × I<sub>n</sub>, we write rc<sub>n</sub>(s, v) (resp. sc<sub>n</sub>(s, v)) for the regular-channel (resp. side-channel) output given the secret s and the attacker-controlled input v.<sup>2</sup> For a system C = (rc, sc, S, I, O<sup>rc</sup>, O<sup>sc</sup>), we often write rc<sub>C</sub> for rc, sc<sub>C</sub> for sc, S<sub>C</sub> for S, I<sub>C</sub> for I, O<sup>rc</sup><sub>C</sub> for O<sup>rc</sup>, and O<sup>sc</sup><sub>C</sub> for O<sup>sc</sup>. We often omit "C" when it is clear from the context.

For a system C and s ∈ S<sub>n</sub>, we write C<sub>n</sub>(s) for the *oracle* which, given v ∈ I<sub>n</sub>, returns the pair of outputs (o<sub>1</sub>, o<sub>2</sub>) ∈ O<sup>rc</sup><sub>n</sub> × O<sup>sc</sup><sub>n</sub> such that rc<sub>n</sub>(s, v) = o<sub>1</sub> and sc<sub>n</sub>(s, v) = o<sub>2</sub>. An *adversary* A is an algorithm that attempts to discover the secret by making some number of oracle queries. As standard, we assume that A has full knowledge of the system. For i ∈ N, we write A<sup>C<sub>n</sub>(s)</sup>(i) for the adversary A that makes at most i oracle queries to C<sub>n</sub>(s). We impose no restriction on how the adversary chooses the inputs to the oracle. Importantly, he may choose the inputs based on the outputs of previous oracle queries. Such an adversary is said to be *adaptive* [25].

Also, for generality, we intentionally leave the computation class of adversaries unspecified. The methods presented in this paper work for any computation class, including the class of polynomial time randomized algorithms and the class of resource-unlimited randomized algorithms. The former is the standard for arguing the security of cryptography algorithms, and the latter ensures information theoretic security. In what follows, unless specified otherwise, we assume that the computation class of adversaries is the class of resource-unlimited randomized algorithms.

As standard, we define security as a bound on the probability that an adversary wins a certain game. Let f be a function from N to N. We define *Win*<sub>A</sub>(n, f) to be the event that the following game outputs true.

$$\begin{array}{l} s \leftarrow \mathcal{S}\_n\\ s' \leftarrow \mathcal{A}^{C\_n(s)}(f(n))\\ \text{Output } s = s' \end{array}$$

Here, the first line selects s uniformly at random from S<sub>n</sub>. We note that, while we restrict to deterministic systems, the adversary algorithm A may be probabilistic and the secret s is selected randomly. Therefore, the full range of probabilities is possible for the event *Win*<sub>A</sub>(n, f). Now, we are ready to give the definition of (f, ε)-security.

<sup>2</sup> We restrict to deterministic systems in this paper. Extension to probabilistic systems is left for future work.

**Definition 1 (**(f, ε)**-security).** *Let f* : N → N *and* ε : N → R *be such that* 0 < ε(*n*) ≤ 1 *for all n* ∈ N*. We say that a system is* (f, ε)*-secure if there exists N* ∈ N *such that for all adversaries* A *and n* ≥ *N, it holds that* Pr[*Win*<sub>A</sub>(*n*, *f*)] < ε(*n*)*.*

Roughly, (f, ε)-secure means that, for all sufficiently large n, there is no attack that recovers the secret in f(n) queries with success probability ε(n).
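The game above, together with the query bound f(n), can be sketched operationally. This is a simplification: modelling the oracle's (regular-channel, side-channel) pair as a single `system` callable and the names used here are assumptions of this sketch.

```python
import random

def win(adversary, secrets, f_n, system):
    """One round of the Win game: sample a secret uniformly, let the
    adversary query the oracle at most f_n times, and check its guess."""
    s = random.choice(sorted(secrets))     # uniform choice of the secret
    queries = 0
    def oracle(v):
        nonlocal queries
        queries += 1
        assert queries <= f_n              # enforce the query budget f(n)
        return system(s, v)                # (regular, side) observation pair
    return adversary(oracle) == s
```

A system is then insecure in the (f, ε) sense when some adversary wins this game with probability at least ε(n) for infinitely many n.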

By abuse of notation, we often implicitly treat an expression e on the security parameter n as the function λn ∈ N. e. Therefore, for example, (n, ε)-secure means that there is no attack that recovers the secret in n queries with success probability ε(n), and (f, 1)-secure means that there is no attack that makes at most f(n) queries and is always successful. Also, by abuse of notation, we often write ε ≤ ε′ when ε(n) ≤ ε′(n) for all sufficiently large n, and likewise for ε < ε′.

**Fig. 1.** Timing-insecure login program

*Example 1 (Leaky Login).* Consider the program shown in Fig. 1, written in a C-like language. The program is an abridged version of the timing-insecure login program from [6]. Here, pass is the secret and guess is the attacker-controlled input, each represented as a length-n bit array. We show that there is an efficient adaptive timing attack against the program that recovers the secret in a linear number of queries.

We formalize the program as the system C where, for all n ∈ N,


Here, a<sub>i</sub> denotes the length-i prefix of a. Note that sc expresses the timing-channel observation, as its output corresponds to the number of times the loop iterated.

For a secret s ∈ S<sub>n</sub>, the adversary A<sup>C<sub>n</sub>(s)</sup>(n) efficiently recovers s as follows. He picks an arbitrary v<sub>1</sub> ∈ I<sub>n</sub> as the initial guess. By observing the timing-channel output sc<sub>n</sub>(s, v<sub>1</sub>), he discovers at least the first bit of s, s[0], because s[0] = v<sub>1</sub>[0] if and only if sc<sub>n</sub>(s, v<sub>1</sub>) > 0. Then, he picks an arbitrary v<sub>2</sub> ∈ {0, 1}<sup>n</sup> satisfying v<sub>2</sub>[0] = s[0], and by observing the timing-channel output, he discovers at least the first two bits of s. Repeating the process n times, he recovers all n bits of s. Therefore, the system is not (n, ε)-secure for any ε. This is an adaptive attack, since the adversary crafts the next input using the knowledge gained from previous observations.
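The adaptive attack can be simulated concretely. In this sketch, `sc` models the loop-iteration count of the program in Fig. 1 (the length of the common prefix of guess and password), and the bit-list representation and function names are assumptions.

```python
def sc(secret, guess):
    """Timing observation: number of loop iterations, i.e. the length of
    the longest common prefix of secret and guess."""
    i = 0
    while i < len(secret) and secret[i] == guess[i]:
        i += 1
    return i

def recover(oracle, n):
    """Adaptive attack: each query either confirms a longer prefix or
    pinpoints the first wrong bit, which is then flipped."""
    guess = [0] * n
    i = 0
    while i < n:
        matched = oracle(guess)
        if matched > i:
            i = matched        # bits 0..matched-1 are confirmed correct
        else:
            guess[i] ^= 1      # mismatch observed at bit i: flip it
            i += 1             # the flipped bit is now known correct
    return guess
```

Since the known-correct prefix grows by at least one bit per query, an n-bit secret falls in at most n queries, matching the linear bound claimed above.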

*Example 2 (Bucketed Leaky Login).* Next, we consider the security of the program from Example 1 but with bucketing applied. Here, we assume a constant number of buckets, k, such that the program returns its output at the time intervals i · n/k for i ∈ {j ∈ N | j ≤ k}.³ (For simplicity, we assume that n is divisible by k.) The bucketed program can be formalized as the system where


where bkt(i, j) is the smallest a ∈ N such that i ≤ a · j. It is easy to see that the system is not constant-time for any k > 1. Nonetheless, we can show that the system is (f, ε)-secure where f(n) = 2^{n/k} − (N + 1) and ε(n) = 1 − (N − 1)/2^{n/k} for any 1 ≤ N < 2^{n/k}. Note that as k approaches 1 (and hence the system becomes constant-time), f approaches 2^n − (N + 1) and ε approaches 1 − (N − 1)/2^n, which match the security bound of the ideal login program that only leaks whether the input guess matched the password. We will show that the approach presented in Sect. 3.1 can be used to derive such a bound.
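For instance, bkt and the bucketed timing observation can be sketched as follows (names are ours; bkt(i, j) is just ⌈i/j⌉):

```python
import math

def bkt(i: int, j: int) -> int:
    """The smallest a in N such that i <= a*j, i.e. ceil(i/j)."""
    return math.ceil(i / j)

def bucketed_sc(iterations: int, n: int, k: int) -> int:
    """Coarsen a raw loop-iteration count into one of the k+1 release
    times i*(n/k), assuming k divides n."""
    return bkt(iterations, n // k)
```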

#### **2.1 Insufficiency of Bucketing**

We show that bucketing is in general insufficient to guarantee the security of systems against adaptive side-channel attacks. In fact, we show that bucketing with even just two buckets is insufficient. (Two is the minimum number of buckets that can be used to show the insufficiency, because having only one bucket implies that the system is constant-time and therefore secure.) More generally, our result applies to any side channel, and it shows that there are systems with just two possible side-channel outputs and a completely secure (i.e., non-interferent [19,37]) regular channel that are efficiently attackable by side-channel-observing adversaries.

Consider the system such that, for all n ∈ N,


Note that the regular channel rc has only one possible output and therefore is non-interferent. The side channel sc has just two possible outputs: given an attacker-controlled input v ∈ I_n, it reveals the v-th bit of s.

³ A similar analysis can be done for any strictly sub-linear number of buckets.

It is easy to see that the system is linearly attackable. That is, for any secret s ∈ S_n, the adversary may recover the entire n bits of s by querying with each of the n possible attacker-controlled inputs. Therefore, the system is not (n, ε)-secure for any ε. Note that the side channel is easily realizable as a timing channel, for example, by having a branch with the branch condition "s[v] = 0" and different running times for the branches.
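A minimal sketch of this counterexample (function names are ours): the regular channel is constant, the side channel has only the outputs 0 and 1, yet n fixed queries recover the secret:

```python
def rc(s: str, v: int) -> int:
    """Regular channel: a single constant output, hence non-interferent."""
    return 0

def sc(s: str, v: int) -> int:
    """Side channel with just two possible outputs: the v-th bit of s."""
    return int(s[v])

def non_adaptive_attack(oracle, n: int) -> str:
    """Recover the whole secret with the fixed queries v = 0, ..., n-1."""
    return ''.join(str(oracle(v)) for v in range(n))
```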

We remark that the above attack is not adaptive. Therefore, the counterexample actually shows that bucketing can be rendered ineffective merely by allowing multiple non-adaptive side-channel observations. We also remark that the counterexample shows that some previously proposed measures are insufficient. For example, the *capacity* measure [5,28,33,39] would not detect the vulnerability of the example, because for deterministic systems the measure is equivalent to the log of the number of possible outputs.

### **3 Sufficient Conditions for Security Against Adaptive Side-Channel Attacks**

In this section, we present conditions that guarantee the security of systems against adaptive side-channel-observing adversaries. The condition SRSCR presented in Sect. 3.1 guarantees that systems that satisfy it are secure, whereas the condition LISCNI presented in Sect. 3.2 guarantees that systems that satisfy it become secure once bucketing is applied. We shall show that the conditions facilitate proving (f, ε)-security of systems by separating the concerns of regular channels from those of side channels. In addition, we show in Sect. 3.3 that the LISCNI condition may be used in combination with constant-time implementation techniques in a compositional manner so as to prove the security of systems that are neither constant-time nor can be made secure by (globally) applying bucketing.

#### **3.1 Secret-Restricted Side-Channel Refinement Condition**

We present the *secret-restricted side-channel refinement* condition (SRSCR). Informally, the idea is to find large subsets of secrets 𝒮 ⊆ P(S_n) such that for each S ∈ 𝒮, the secrets in S are difficult for an adversary to recover by observing only the regular channel, and the side channel reveals no more information than the regular channel for those sets of secrets. Then, because the union of these subsets is large, the entire system is also ensured to be secure with high probability. We adopt the *refinement order* [29,38], which has been studied in quantitative information flow research, to formalize the notion of "reveals no more information". Roughly, a channel C_1 is said to be a refinement of a channel C_2 if, for every attacker-controlled input, every pair of secrets that C_2 can distinguish can also be distinguished by C_1.

We write O^• for the indexed family of sets such that O^•_n = {•} for all n ∈ N. Also, we write sc^• for the indexed family of functions such that sc^•_n(s, v) = • for all n ∈ N and (s, v) ∈ S_n × I_n. For C = (rc, sc, S, I, O^rc, O^sc), we write C^• for the system (rc, sc^•, S, I, O^rc, O^•). We define the notion of *regular-channel security*.

**Definition 2 (Regular-channel** (f, ε)**-security).** We say that C is *regular-channel* (f, ε)*-secure* if C^• is (f, ε)-secure.

Roughly, regular-channel security says that the system is secure against attacks that only observe the regular channel output.

Let us fix a system C = (rc, sc, S, I, O^rc, O^sc). For an indexed family 𝒮 of sets of sets of secrets (i.e., 𝒮_n ⊆ P(S_n) for each n), we write S ≺ 𝒮 when S is an indexed family of sets of secrets such that S_n ∈ 𝒮_n for each n. Note that each such S_n is a subset of the full secret space. Also, for S ≺ 𝒮, we write C|_S for the system that is equal to C except that its secrets are restricted to S, that is, (rc, sc, S, I, O^rc, O^sc). Next, we formalize the SRSCR condition.

**Definition 3 (Secret-Restricted Side-Channel Refinement).** Let f : N → N, ε : N → (0, 1], and 0 < r ≤ 1. We say that the system C = (rc, sc, S, I, O^rc, O^sc) satisfies the *secret-restricted side-channel refinement* condition with f, ε, and r, written SRSCR(f, ε, r), if there exists an indexed family of sets of sets of secrets 𝒮^res such that 𝒮^res_n ⊆ P(S_n) for all n ∈ N, and:

(1) For all n ∈ N, r ≤ |⋃𝒮^res_n| / |S_n|;

(2) For all S ≺ 𝒮^res, C|_S is regular-channel (f, ε)-secure; and

(3) For all S ≺ 𝒮^res, n ∈ N, v ∈ I_n, and s_1, s_2 ∈ S_n, rc_n(s_1, v) = rc_n(s_2, v) ⇒ sc_n(s_1, v) = sc_n(s_2, v).

Condition (2) says that the system is regular-channel (f, ε)-secure when restricted to any subsets of secrets S ≺ 𝒮^res. Condition (3) says that the system's side channel reveals no more information than its regular channel for the restricted secret subsets. Condition (1) says that the ratio of the restricted set over the entire space of secrets is at least r.⁴

We informally describe why SRSCR is a sufficient condition for security. The condition guarantees that, for the restricted secrets 𝒮^res, the attacker gains no additional information by observing the side channel beyond what he already learns by observing the regular channel. Then, because r is a lower bound on the probability that a randomly selected secret falls in 𝒮^res, the system is secure provided that r is suitably large and the system is regular-channel secure. The theorem below formalizes this intuition.

**Theorem 1 (**SRSCR **Soundness).** *Suppose C satisfies SRSCR(f, ε, r). Then, C is (f, ε′)-secure, where ε′ = 1 − r(1 − ε).*

*Proof.* Let 𝒮^res be an indexed family of sets of secret subsets that satisfies conditions (1), (2), and (3) of SRSCR(f, ε, r). By condition (2), for all sufficiently large n and adversaries A, Pr[Win^{•,res}_A(n, f)] < ε(n), where Win^{•,res}_A(n, f) is the modified game in which the oracle C_n(s) always outputs • as its side-channel output and the secret s is selected randomly from ⋃𝒮^res_n (rather than from S_n).

⁴ It is easy to relax the notion to be asymptotic so that the conditions need to hold only for n ≥ N for some N ∈ N.

For any n, the probability that a randomly selected element from S_n is in ⋃𝒮^res_n is at least r by condition (1). That is, Pr[s ∈ ⋃𝒮^res_n | s ← S_n] ≥ r. Also, Pr[¬Win^{•,res}_A(n, f)] > 1 − ε(n) (for sufficiently large n) for any A by the argument above. Therefore, by condition (3), for sufficiently large n,

$$\Pr[\neg Win\_{\mathcal{A}}(n,f)] \ge \Pr[s \in \bigcup \mathcal{S}\_n^{res} \mid s \gets \mathcal{S}\_n] \cdot \Pr[\neg Win\_{\mathcal{A}}^{\bullet,res}(n,f)] > r \cdot (1 - \epsilon(n))$$

Therefore, Pr[Win_A(n, f)] < 1 − r(1 − ε(n)) for sufficiently large n.

As a special case where the ratio r is 1, Theorem 1 implies that if a system satisfies SRSCR(f, ε, 1) then it is (f, ε)-secure.

*Example 3.* Recall the bucketed leaky login program from Example 2. We show that the program satisfies the SRSCR condition. For each n, a ∈ {0, 1}^n, and 0 ≤ i < k, let S^{a,i}_n ⊆ S_n be the set of secrets whose bits in positions i · n/k through (i + 1) · n/k − 1 may differ but whose remaining n − n/k bits agree with a (and are therefore the same). That is,

$$S\_n^{a,i} = \{\, s \in \mathcal{S}\_n \mid s[0, \dots, i \cdot n/k - 1] = a[0, \dots, i \cdot n/k - 1] \;\wedge\; s[(i{+}1) \cdot n/k, \dots, n{-}1] = a[(i{+}1) \cdot n/k, \dots, n{-}1] \,\}$$

Let 𝒮^res be the indexed family of sets of sets of secrets such that 𝒮^res_n = {S^{a,i}_n | a ∈ {0, 1}^n} for some i. Then, the system satisfies conditions (1), (2), and (3) of SRSCR(f, ε, r) with r = 1, f(n) = 2^{n/k} − (N + 1), and ε = 1 − (N − 1)/2^{n/k} for any 1 ≤ N < 2^{n/k}. Note that (1) is satisfied with r = 1 because S_n = ⋃𝒮^res_n, and (2) is satisfied because |S^{a,i}_n| = 2^{n/k} and (f, ε) matches the security of the ideal login program without side channels for a set of secrets of size 2^{n/k}. To see why (3) is satisfied, note that for any v ∈ I_n and s ∈ S^{a,i}_n, sc_n(s, v) = i if s ≠ v, and sc_n(s, v) = k if s = v. Hence, for any v ∈ I_n and s_1, s_2 ∈ S^{a,i}_n, rc_n(s_1, v) = rc_n(s_2, v) ⇒ sc_n(s_1, v) = sc_n(s_2, v). Therefore, by Theorem 1, it follows that the bucketed leaky login program is (f, ε)-secure. Note that the bound matches the one given in Example 2.

To effectively apply Theorem 1, one needs to find suitable subsets of secrets 𝒮^res on which the system's regular channel is (f, ε)-secure and the side channel satisfies the refinement relation with respect to the regular channel. As also observed in prior work [29,38], the refinement relation is a 2-safety property [13,35], for which there are a number of effective verification methods [2,6,10,32,34]. For instance, self-composition [3,4,8,35] is a well-known technique that can be used to verify arbitrary 2-safety properties.

We note that a main benefit of Theorem 1 is *separation of concerns*, whereby the security of the regular channel can be proven independently of side channels, and the conditions required for side channels can be checked separately. For instance, a system designer may prove regular-channel (f, ε)-security by an elaborate manual argument, while the side-channel conditions are checked, possibly automatically, by established program verification methods such as self-composition.

**Remarks.** We make some additional observations regarding the SRSCR condition. First, while Theorem 1 derives a sound security bound, the bound may not be the tightest one. Indeed, when the adversary's error probability (i.e., the "ε" part of (f, ε)-security) is 1, the bucketed leaky login program can be shown to be actually (k(2^{n/k} − 2), 1)-secure, whereas the bound derived in Example 3 only showed that it is (2^{n/k} − 2, 1)-secure. That is, there is a factor-k gap in the bounds. Intuitively, the gap occurs for this example because the buckets partition a secret into k blocks of n/k bits each, and while an adversary needs to recover the bits of every block in order to recover the entire secret, the analysis derived the bound by assessing only the effort required to recover the bits of one block. Extending the technique to enable tighter analyses is left for future work.

Secondly, the statement of Theorem 1 says that when the regular channel of the system is (f, ε)-secure for certain subsets of secrets, then the whole system is (f, ε′)-secure under certain conditions. This may give the impression that only the adversary-success-probability parameter (i.e., ε) of (f, ε)-security is affected by the additional consideration of side channels, leaving the number-of-oracle-queries parameter (i.e., f) unaffected. However, as also seen in Example 2, the two parameters are often correlated, so that a smaller f implies a smaller ε and vice versa. Therefore, Theorem 1 suggests that the change in the probability parameter (i.e., from ε to ε′) may need to be compensated by a change in the degree of security with respect to the number of oracle queries.

Finally, condition (2) of SRSCR stipulates that the regular channel is (f, ε)-secure for each restricted family of sets of secrets S ≺ 𝒮^res rather than for the entire space of secrets S. In general, a system can be less secure when secrets are restricted, because the adversary has a smaller space of secrets to search. Indeed, in the case when the error probability is 1, the regular channel of the bucketed leaky login program can be shown to be (2^n − 2, 1)-secure, but when restricted to each S ≺ 𝒮^res used in the analysis of Example 3, it is only (2^{n/k} − 2, 1)-secure. That is, there is an implicit correlation between the sizes of the restricted subsets and the degree of regular-channel security. Therefore, finding 𝒮^res such that each S ∈ 𝒮^res_n is large and satisfies the conditions is important for deriving good security bounds, even when the ratio |⋃𝒮^res_n| / |S_n| is large, as in the analysis of the bucketed leaky login program.

#### **3.2 Low-Input Side-Channel Non-Interference Condition**

While SRSCR facilitates proving the security of systems by separating regular channels from side channels, it requires one to identify suitable subsets of secrets 𝒮^res that satisfy the conditions. This can be a hurdle to applying the proof method. To this end, this section presents a condition, called *low-input side-channel non-interference* (LISCNI), which guarantees that a system satisfying it becomes secure after applying bucketing (or other techniques) to reduce the number of side-channel outputs. Unlike SRSCR, the condition does not require identifying secret subsets. Roughly, the condition stipulates that the regular channel is secure (for the entire space of secrets) and that the side-channel outputs are independent of attacker-controlled inputs.

We show that a system satisfying the condition becomes a system satisfying SRSCR once bucketing is applied, where the degree of security (i.e., the parameters f, ε, r of SRSCR) will be proportional to the degree of regular-channel security and the granularity of buckets. Roughly, this holds because, for a system whose side-channel outputs are independent of attacker-controlled inputs, bucketing is guaranteed to partition the secrets into a small number of sets (relative to the bucket granularity) such that, for each of the sets, the side channel cannot distinguish the secrets in the set, and regular-channel security transfers to a certain degree to the case where the secrets are restricted to those in the set.

As we shall show next, while the condition is not permissive enough to prove security of the leaky login program (cf. Examples 1, 2 and 3), it covers interesting scenarios such as fast modular exponentiation (cf. Example 4). Also, as we shall show in Sect. 3.3, the condition may be used compositionally in combination with the constant-time implementation technique [1,3,9,22] to further widen its applicability.

**Definition 4 (Low-Input Side-Channel Non-Interference).** Let f : N → N and ε : N → (0, 1]. We say that the system C satisfies the *low-input side-channel non-interference* condition with f and ε, written LISCNI(f, ε), if the following conditions are satisfied:

(1) C is regular-channel (f, ε)-secure; and

(2) For all n ∈ N, s ∈ S_n, and v_1, v_2 ∈ I_n, it holds that sc_n(s, v_1) = sc_n(s, v_2).

Condition (2) says that the side-channel outputs are independent of low inputs (i.e., attacker-controlled inputs). We note that this is *non-interference* with respect to low inputs, whereas the usual notion of non-interference says that the outputs are independent of high inputs (i.e., secrets) [19,37].⁵
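Condition (2) can be checked exhaustively for small n; a brute-force sketch (the function name and the enumeration approach are ours):

```python
from itertools import product

def satisfies_liscni_sc(sc, n: int) -> bool:
    """Low-input side-channel non-interference for bit-string secrets and
    inputs of length n: for every secret, the side-channel output must be
    the same for all attacker-controlled inputs."""
    words = [''.join(bits) for bits in product('01', repeat=n)]
    return all(len({sc(s, v) for v in words}) == 1 for s in words)
```

For instance, a timing channel that depends only on the secret's Hamming weight satisfies the condition, whereas the leaky login timing channel of Example 1 does not.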

The LISCNI condition ensures the security of systems after bucketing is applied. We next formalize the notion of "applying bucketing".

**Definition 5 (Bucketing).** Let C be a system and k ∈ N such that k > 0. The system C after k-*bucketing* is applied, written Bkt_k(C), is a system C′ that satisfies the following:


Roughly, k-bucketing partitions the side-channel outputs into k buckets. We note that our notion of "bucketing" is quite general in that it does not specify how the side-channel outputs are partitioned into the buckets. Indeed, as we shall show next, the security guarantee derived by LISCNI only requires that the side-channel outputs are partitioned into a small number of buckets.

⁵ As with SRSCR, it is easy to relax the notion to be asymptotic so that condition (2) only needs to hold for large n.

This makes our results applicable to any techniques (beyond the usual bucketing technique for timing channels [7,14,26,27,41]) that reduce the number of possible side-channel outputs.
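In this general reading, applying bucketing just post-composes the side channel with an arbitrary partition of its outputs; a sketch (names are ours):

```python
def apply_bucketing(sc, bucket_of):
    """Coarsen a side channel sc by an arbitrary partition of its outputs:
    bucket_of maps each raw observation to one of k bucket labels."""
    def bucketed_sc(s, v):
        return bucket_of(sc(s, v))
    return bucketed_sc
```

For the timing-channel instance, bucket_of would round a running time up to the next release time, as in Example 2.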

The following theorem states that a system satisfying the LISCNI condition becomes one that satisfies the SRSCR condition after suitable bucketing is applied.

**Theorem 2 (**LISCNI **Soundness).** *Suppose that C satisfies LISCNI(f, ε). Let k > 0 be such that k · ε ≤ 1. Then, Bkt_k(C) satisfies SRSCR(f, k · ε, 1/k).*

*Proof.* Let C′ = Bkt_k(C). By condition (2) of k-bucketing and condition (2) of LISCNI(f, ε), we have that for all n ∈ N, s ∈ S_n, and v_1, v_2 ∈ I_n, sc^{C′}_n(s, v_1) = sc^{C′}_n(s, v_2). Therefore, by k-bucketing, there must be an indexed family of sets of secrets S′ such that for all n, (a) S′_n ⊆ S_n, (b) |S′_n| ≥ |S_n|/k, and (c) for all s_1, s_2 ∈ S′_n and v_1, v_2 ∈ I_n, sc^{C′}_n(s_1, v_1) = sc^{C′}_n(s_2, v_2). Note that such an S′ can be found by, for each n, choosing a bucket into which a maximal number of secrets fall. We define an indexed family of sets of sets of secrets 𝒮^res to be such that 𝒮^res_n is the singleton set {S′_n} for each n.

We show that C′ satisfies conditions (1), (2), and (3) of SRSCR(f, k · ε, 1/k) with the restricted secret subsets 𝒮^res defined above. Firstly, (1) is satisfied because |S′_n| ≥ |S_n|/k. Also, (3) is satisfied because of property (c) above (i.e., the side channel is non-interferent for the subset).

It remains to show that (2) is satisfied, that is, C′|_{S′} is regular-channel (f, k · ε)-secure. For contradiction, suppose that C′|_{S′} is not regular-channel (f, k · ε)-secure, that is, there exists a regular-channel attack A that queries (the regular channel of) C′|_{S′} at most f(n) many times and successfully recovers the secret with probability at least k · ε(n). Then, we can construct a regular-channel adversary for C which simply runs A (on any secret from S_n). Note that the adversary makes at most f(n) many queries. We argue that the probability that the adversary succeeds in recovering the secret is at least ε. That is, we show that Pr[Win^•_A(n, f)] ≥ ε(n) (for sufficiently large n), where Win^•_A(n, f) is the modified game in which the oracle always outputs • as its side-channel output.

To see this, note that the probability that a secret randomly selected from S_n is in S′_n is at least 1/k, that is, Pr[s ∈ S′_n | s ← S_n] ≥ 1/k. Also, A's regular-channel attack succeeds with probability at least k · ε given a randomly chosen secret from S′_n, that is, Pr[Win^{•,res}_A(n, f)] ≥ k · ε(n), where Win^{•,res}_A(n, f) is the modified game in which the oracle always outputs • as its side-channel output and the secret is selected randomly from S′_n (rather than from S_n). Therefore, for sufficiently large n, we have:

$$\Pr[\operatorname{Win}\_{\mathcal{A}}^{\bullet}(n,f)] \ge \Pr[s \in S\_n^{\prime} \mid s \gets \mathcal{S}\_n] \cdot \Pr[\operatorname{Win}\_{\mathcal{A}}^{\bullet,res}(n,f)] \ge 1/k \cdot (k \cdot \epsilon(n)) = \epsilon(n)$$

This contradicts condition (1) of LISCNI(f, ε), which says that C is regular-channel (f, ε)-secure. Therefore, C′|_{S′} is regular-channel (f, k · ε)-secure.

As a corollary of Theorems 1 and 2, we have the following.

**Corollary 1.** *Suppose that C satisfies LISCNI(f, ε). Let k > 0 be such that k · ε ≤ 1. Then, Bkt_k(C) is (f, ε′)-secure where ε′ = 1 − 1/k + ε.*

Note that as k approaches 1 (and hence the system becomes constant-time), the security bound of Bkt_k(C) approaches (f, ε), matching the regular-channel security of C. As with Theorem 1, Theorem 2 may give the impression that the conditions only affect the adversary-success-probability parameter (i.e., ε) of (f, ε)-security, leaving the number-of-queries parameter (i.e., f) unaffected. However, as also remarked in Sect. 3.1, the two parameters are often correlated, so that a change in one can affect the other. Also, like SRSCR, LISCNI separates the concerns regarding regular channels from those regarding side channels. A system designer may check the security of the regular channel while disregarding the side channel, and separately prove the condition on the side channel.

**Fig. 2.** Fast modular exponentiation

*Example 4 (Fast Modular Exponentiation).* Fast modular exponentiation is an operation that is often found in cryptographic algorithms such as RSA [23,30]. Figure 2 shows its implementation written in a C-like language. It computes y^x mod m, where x is the secret represented as a length-n bit array and y is an attacker-controlled input. The program is not constant-time (assuming that the then and else branches in the loop have different running times), and effective timing attacks have been proposed against it [23,30].

However, assuming that the running time of the operation (a \* y) % m is independent of y, it can be seen that the program satisfies the LISCNI condition.⁶ Under the assumption, the program can be formalized as the system C where, for all n ∈ N,

– S_n = I_n = {0, 1}^n;
– O^rc_n = O^sc_n = N;
– For all (s, v) ∈ S_n × I_n, rc_n(s, v) = v^s mod m; and
– For all (s, v) ∈ S_n × I_n, sc_n(s, v) = time_t · num(s, 1) + time_f · num(s, 0).

⁶ This is admittedly an optimistic assumption. Indeed, the proposed timing attacks exploit the fact that the running time of the operation can depend on y [23,30]. Here, we assume that the running time of the operation is made independent of y by some means (e.g., by adopting the constant-time implementation technique).

Here, num(s, b) = |{i ∈ N | i < n ∧ s[i] = b}| for b ∈ {0, 1}, and time_t (resp. time_f) is the running time of the then (resp. else) branch.
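Under the timing model above, the system can be sketched as follows (TIME_T and TIME_F are assumed constants, and we process the secret exponent least-significant bit first; this is a sketch, not the paper's Fig. 2 verbatim):

```python
TIME_T, TIME_F = 3, 1   # assumed running times of the then/else branches

def fast_modexp(x_bits: str, y: int, m: int):
    """Square-and-multiply computation of y^x mod m together with its
    modeled running time: each iteration costs TIME_T when the secret
    bit is 1 (then branch) and TIME_F otherwise (else branch)."""
    a, t = 1, 0
    for b in x_bits:            # bits of the secret x, LSB first
        if b == '1':
            a = (a * y) % m     # multiplication step ('then' branch)
            t += TIME_T
        else:
            t += TIME_F         # 'else' branch
        y = (y * y) % m         # squaring step (assumed constant-time)
    return a, t
```

Note that the modeled time t equals time_t · num(s, 1) + time_f · num(s, 0), and is therefore independent of the low input y, which is exactly condition (2) of LISCNI.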

Let the computation class of adversaries be the class of randomized polynomial-time algorithms. Then, under the standard computational assumption that inverting modular exponentiation is hard, one can show that C satisfies LISCNI(f, ε) for any f and negligible ε. This follows because the side-channel outputs are independent of low inputs, and the regular channel is (f, ε)-secure for any f and negligible ε under the assumption.⁷ Therefore, the program can be made (f, ε)-secure for any f and negligible ε by applying bucketing.

**Remarks.** We make some additional observations regarding the LISCNI condition. First, similar to condition (3) of SRSCR, the low-input independence condition of LISCNI (condition (2)) is a 2-safety property and is amenable to the various verification methods proposed for this class of properties. In fact, because the condition is essentially side-channel non-interference but with respect to low inputs instead of high inputs, it can be checked by the methods for checking ordinary side-channel non-interference, with the roles of high inputs and low inputs reversed [1,3,6,9,20].

Secondly, we note that the leaky login program from Example 1 does not satisfy LISCNI. This is because the program's side channel is not non-interferent with respect to low inputs. Indeed, given any secret s ∈ S_n, one can vary the running time by choosing low inputs v, v′ ∈ I_n with differing lengths of matching prefixes, that is, (argmax_i s|_i = v|_i) ≠ (argmax_i s|_i = v′|_i). Nevertheless, as we have shown in Examples 2 and 3, the program becomes secure once bucketing is applied. In fact, it becomes one that satisfies SRSCR, as shown in Example 3. Ideally, we would like to find a relatively simple condition (on systems before bucketing is applied) that covers many systems that would become secure by applying bucketing. However, finding such a condition that covers a system like the leaky login program may be non-trivial. Indeed, predicting that the leaky login program becomes secure after applying bucketing appears to require a more subtle analysis of the interaction between low inputs and high inputs. (In fact, it can be shown that arbitrarily partitioning the side-channel outputs into a small number of buckets does not ensure security for this program.) Extending the technique to cover such scenarios is left for future work.

#### **3.3 Combining Bucketing and Constant-Time Implementation Compositionally**

We show that the LISCNI condition may be applied compositionally with the constant-time implementation technique (technically, we will only apply condition (2) of LISCNI compositionally). As we shall show next, the combined approach is able to ensure the security of some non-constant-time systems that cannot

⁷ The latter holds because (f, ε)-security is asymptotic and the probability that any regular-channel adversary of the computation class correctly guesses the secret for this system is negligible (under the computational hardness assumption). Therefore, a similar analysis can be done for any sub-polynomial number of buckets.

be made secure by applying bucketing globally to the whole system. We remark that, in contrast to the previous sections of the paper, the results of this section are more specialized to the case of timing channels. First, we formalize the notion of a constant-time implementation.

**Fig. 3.** A non-constant-time program that cannot be made secure by globally applying bucketing.

**Definition 6 (Constant-Time).** Let f : N → N and ε : N → (0, 1]. We say that a system C satisfies the *constant-time* condition (or, *timing-channel non-interference*) with f and ε, written CT(f, ε), if the following conditions are satisfied:

(1) C is regular-channel (f, ε)-secure; and

(2) For all n ∈ N, v ∈ I_n, and s_1, s_2 ∈ S_n, sc_n(s_1, v) = sc_n(s_2, v).

Note that CT requires that the side channel is non-interferent (with respect to secrets). The following theorem is immediate from the definition and states that CT is a sufficient condition for security.

**Theorem 3 (**CT **Soundness).** *If C satisfies CT(f, ε), then C is (f, ε)-secure.*

To motivate the combined application of CT and LISCNI, let us consider the following example, which is neither constant-time nor can be made secure by (globally) applying bucketing.

*Example 5.* Figure 3 shows a simple, albeit contrived, program that we will use to motivate the combined approach. Here, sec is an n-bit secret and inp is an n-bit attacker-controlled input. Both sec and inp are interpreted as unsigned n-bit integers, where − and > are the usual unsigned integer subtraction and comparison operations. The regular channel always outputs true and hence is non-interferent. Therefore, only the timing channel is of concern.

The program can be formalized as C_comp where, for all n ∈ N,


Note that the side channel outputs the sum of the high input and the low input. It is easy to see that the system is not constant-time (i.e., not CT(f, ε) for any f and ε). Furthermore, the system is not secure as is, because an adversary can immediately recover the secret by querying with any input and subtracting the input from the side-channel output.

Also, it is easy to see that the system does not satisfy LISCNI(f, ε) for any f and ε either, because its side-channel outputs are not independent of low inputs. In fact, we can show that arbitrarily applying bucketing (globally) to the system does not guarantee security. To see this, let us consider applying bucketing with just two buckets, whereby the buckets partition the possible running times into two halves, so that running times less than or equal to 2^n fall into the first bucket and those greater than 2^n fall into the other bucket. After applying bucketing, the system is C′ where


We show that there exists an efficient adaptive attack against C′. Let s ∈ S_n. The adversary A recovers s by making only linearly many queries, via the following process. First, A queries with the input v_1 = 2^{n−1}. By observing the side-channel output, A will know whether 0 ≤ s ≤ 2^{n−1} (i.e., the side-channel output was 0) or 2^{n−1} < s ≤ 2^n (i.e., the side-channel output was 1). In the former case, A picks the input v_2 = 2^{n−1} + 2^{n−2} for the next query, and in the latter case, he picks v_2 = 2^{n−2}. Continuing the process in a binary-search manner, halving the space of possible secrets with each query, A is able to home in on s within n many queries. Therefore, C′ is not (n, ε)-secure for any ε.
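The binary-search attack can be sketched as follows (oracle(v) stands for the two-bucket observation of the running time s + v; names are ours):

```python
def two_bucket_sc(s: int, v: int, n: int) -> int:
    """Two-bucket timing observation: 0 if the running time s + v is at
    most 2^n, and 1 otherwise."""
    return 0 if s + v <= 2 ** n else 1

def binary_search_attack(oracle, n: int) -> int:
    """Recover the n-bit secret s in at most n queries: the query
    v = 2^n - mid tests whether s <= mid, halving the candidate range."""
    lo, hi = 0, 2 ** n - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(2 ** n - mid) == 0:   # observation is 0 iff s <= mid
            hi = mid
        else:
            lo = mid + 1
    return lo
```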

Next, we present the compositional bucketing approach. Roughly, our compositionality theorem (Theorem 4) states that the sequential composition of a constant-time system with a system whose side channel is non-interferent with respect to low inputs can be made secure by applying bucketing only to the non-constant-time component. As with LISCNI, the degree of security of the composed system is relative to that of the regular channel and the granularity of the buckets.

To state the compositionality theorem, we explicitly separate the conditions on side channels of CT and LISCNI from those on regular channels and introduce terminology that refers only to the side-channel conditions. Let us fix C. We say that C satisfies CT^sc if it satisfies condition (2) of CT, that is, for all n ∈ ℕ, v ∈ I_n, and s_1, s_2 ∈ S_n, sc_n(s_1, v) = sc_n(s_2, v). Also, we say that C satisfies LISCNI^sc if it satisfies condition (2) of LISCNI, that is, for all n ∈ ℕ, s ∈ S_n, and v_1, v_2 ∈ I_n, sc_n(s, v_1) = sc_n(s, v_2). Next, we define the sequential composition of systems.

**Definition 7 (Sequential Composition).** Let C† and C‡ be systems such that S⟨C†⟩ = S⟨C‡⟩, I⟨C†⟩ = I⟨C‡⟩, and for all n ∈ ℕ, O^sc⟨C†⟩_n ⊆ ℕ and O^sc⟨C‡⟩_n ⊆ ℕ. The *sequential composition* of C† with C‡, written C†; C‡, is the system C such that

– S⟨C⟩ = S⟨C†⟩ and I⟨C⟩ = I⟨C†⟩; and
– for all n ∈ ℕ and (s, v) ∈ S_n × I_n, sc⟨C⟩_n(s, v) = sc⟨C†⟩_n(s, v) + sc⟨C‡⟩_n(s, v).

We note that the definition of sequential composition specifically targets the case where the side channel is a timing channel: it requires the side-channel outputs to be numeric values and defines the side-channel output of the composed system as the sum of those of the components. Also, the definition leaves the composition of regular channels open, allowing the regular channel of the composed system to be any function on S_n × I_n. We are now ready to state the compositionality theorem.

**Theorem 4 (Compositionality).** *Let* C† *be a system that satisfies* LISCNI^sc *and* C‡ *a system that satisfies* CT^sc*. Suppose that* Bkt_k(C†); C‡ *is regular-channel* (f, ε)*-secure where* k · ε ≤ 1*. Then,* Bkt_k(C†); C‡ *is* (f, ε′)*-secure, where* ε′ = 1 − 1/k + ε*.*

*Proof.* By Theorem 1, it suffices to show that Bkt_k(C†); C‡ satisfies SRSCR(f, k · ε, 1/k). By an argument similar to the proof of Theorem 2, there must be an indexed family of sets of secrets S′ such that, for all n ∈ ℕ, (a) S′_n ⊆ S_n, (b) |S′_n| ≥ |S_n|/k, and (c) for all s_1, s_2 ∈ S′_n and v_1, v_2 ∈ I_n, sc⟨Bkt_k(C†)⟩_n(s_1, v_1) = sc⟨Bkt_k(C†)⟩_n(s_2, v_2). We define the indexed family of sets of sets of secrets S_res so that S_res,n is the singleton set {S′_n} for each n.

We show that C = Bkt_k(C†); C‡ satisfies conditions (1), (2), and (3) of SRSCR(f, k · ε, 1/k) with the restricted secret subsets S_res defined above. Firstly, (1) is satisfied because |S′_n| ≥ |S_n|/k. Also, because Bkt_k(C†); C‡ is regular-channel (f, ε)-secure, we can show that (2) is satisfied by an argument similar to the one in the proof of Theorem 2.

It remains to show that (3) is satisfied. It suffices to show that for all n ∈ ℕ, v ∈ I_n, and s_1, s_2 ∈ S′_n, sc⟨C⟩_n(s_1, v) = sc⟨C⟩_n(s_2, v). That is, the side channel of the composed system is non-interferent (with respect to high inputs) on the subset S′. By the definition of sequential composition, for all v ∈ I_n and s ∈ S_n, sc⟨C⟩_n(s, v) = sc⟨Bkt_k(C†)⟩_n(s, v) + sc⟨C‡⟩_n(s, v). Therefore, for all v ∈ I_n and s_1, s_2 ∈ S′_n,

$$\begin{array}{lcl} \mathsf{sc}\langle C\rangle\_n(s\_1, v) &= \mathsf{sc}\langle Bkt\_k(C^\dagger)\rangle\_n(s\_1, v) + \mathsf{sc}\langle C^\ddagger\rangle\_n(s\_1, v) \\ &= \mathsf{sc}\langle Bkt\_k(C^\dagger)\rangle\_n(s\_2, v) + \mathsf{sc}\langle C^\ddagger\rangle\_n(s\_2, v) \\ &= \mathsf{sc}\langle C\rangle\_n(s\_2, v) \end{array}$$

because sc⟨C‡⟩_n(s_1, v) = sc⟨C‡⟩_n(s_2, v) by CT^sc of C‡, and sc⟨Bkt_k(C†)⟩_n(s_1, v) = sc⟨Bkt_k(C†)⟩_n(s_2, v) by (c) above.

We note that the notion of sequential composition is symmetric. Therefore, Theorem 4 implies that composing the components in the reverse order, that is, C‡; Bkt_k(C†), is also secure provided that its regular channel is secure.

The compositionality theorem suggests the following compositional approach to ensuring security. Given a system C that is a sequential composition of a component whose side-channel outputs are independent of high inputs (i.e., satisfies CT^sc) and a component whose side-channel outputs are independent of low inputs (i.e., satisfies LISCNI^sc), we can ensure the security of C by proving its regular-channel security and applying bucketing only to the non-constant-time component.

*Example 6.* Let us apply compositional bucketing to the system C_comp from Example 5. Recall that the system is not constant-time, and that applying bucketing to the whole system does not ensure its security. The system can be seen as the sequential composition C_comp = C†; C‡ where C† and C‡ satisfy the following:


Note that C‡ satisfies CT^sc as its side-channel outputs are high-input independent, and C† satisfies LISCNI^sc as its side-channel outputs are low-input independent. By applying bucketing only to the component C†, we obtain the system Bkt_k(C†); C‡. The regular channel of Bkt_k(C†); C‡ (i.e., that of C_comp) is (f, ε)-secure for any f and negligible ε because it is non-interferent (with respect to high inputs) and the probability that an adversary may recover a secret for such a system is at most 1/|S_n|.<sup>8</sup> Therefore, by Theorem 4, Bkt_k(C†); C‡ is (f, 1 − 1/k + ε)-secure for any f and negligible ε.
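To make the decomposition concrete, the following Python sketch (ours; the encoding and the bucket function are assumptions, not the paper's definitions) splits the side channel sc(s, v) = s + v into a low-input-independent part sc†(s, v) = s and a high-input-independent part sc‡(s, v) = v, composes them by summation as in Definition 7, and applies bucketing to the first part only.

```python
# Hypothetical model of Example 6: side channels are modeled as plain
# functions, and sequential composition sums their outputs (Definition 7).

def bucket(t: int, k: int, t_max: int) -> int:
    """Round a positive running time up to the nearest of k bucket boundaries."""
    width = -(-t_max // k)              # ceil(t_max / k)
    return -(-t // width) * width       # smallest boundary >= t

def sc_dagger(s: int, v: int) -> int:   # LISCNI^sc: depends on the secret only
    return s

def sc_ddagger(s: int, v: int) -> int:  # CT^sc: depends on the low input only
    return v

def composed_sc(s: int, v: int, k: int, s_max: int) -> int:
    """Side channel of Bkt_k(C†); C‡: bucketed secret part plus low part."""
    return bucket(sc_dagger(s, v), k, s_max) + sc_ddagger(s, v)

# With k = 4 buckets and secrets in 1..16, a fixed low input yields at most
# k distinct observations, so secrets sharing a bucket are indistinguishable.
k, s_max = 4, 16
obs = {composed_sc(s, 3, k, s_max) for s in range(1, s_max + 1)}
assert len(obs) == k
```

Secrets falling into the same bucket produce identical composed observations for every low input, which is exactly the restricted secret subset S′ used in the proof of Theorem 4.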

The above example shows that compositional bucketing can be used to ensure the security of non-constant-time systems that cannot be made secure by whole-system bucketing. It is interesting to observe that the constant-time condition, CT^sc, requires the side-channel outputs to be independent of high inputs but allows dependency on low inputs, while LISCNI^sc is the dual: it says that the side-channel outputs are independent of low inputs but may depend on high inputs. Our compositionality theorem (Theorem 4) states that a system consisting of such parts can be made secure by applying bucketing only to the part that satisfies the latter condition.

It is easy to see that sequentially composing components that satisfy CT^sc results in a system that satisfies CT^sc, and likewise, sequentially composing components that satisfy LISCNI^sc results in a system that satisfies LISCNI^sc. Therefore, such compositions can be used freely in conjunction with the compositional bucketing technique of this section. We also conjecture that components that have been made secure by compositional bucketing can themselves be sequentially composed to form a secure system (possibly with some decrease in the degree of security). We leave a more detailed investigation for future work.

<sup>8</sup> Therefore, a similar analysis can be done for any strictly sub-exponential number of buckets.

#### **4 Related Work**

As remarked in Sect. 1, much research has been done on defending against timing attacks and, more generally, side-channel attacks. For instance, there have been experimental evaluations of the effectiveness of bucketing and other timing-channel mitigation schemes [14,18], and other works have proposed information-theoretic methods for formally analyzing the security of (deterministic and probabilistic) systems against adaptive adversaries [12,25].

However, few prior works have formally analyzed the effect of bucketing on timing-channel security (or of similar techniques for other side channels) against adaptive adversaries. Indeed, to our knowledge, the only prior work to do so is the series of works by Köpf et al. [26,27], who investigated the effect of bucketing applied to blinded cryptographic algorithms. They show that applying bucketing to a blinded cryptographic algorithm whose regular channel is IND-CCA2 secure results in an algorithm that is IND-CCA2 secure against timing-channel-observing adversaries. In addition, they show bounds on the information leaked by such bucketed blinded cryptographic algorithms in terms of quantitative information flow [5,28,33,39,40]. By contrast, we analyze the effect of applying bucketing to general systems, show that bucketing is in general insufficient against adaptive adversaries, and present novel conditions that guarantee security against such adversaries. (In fact, the results of [26,27] may be seen as an instance of our LISCNI condition, because blinding makes the behavior of cryptographic algorithms effectively independent of attacker-controlled inputs.) Also, our results are given in the form of (f, ε)-security, which can provide precise bounds on the number of queries needed by adaptive adversaries to recover secrets.

Next, we compare our work with the work on constant-time implementations (i.e., timing-channel non-interference) [1,3,6,9,20,22]. Previous works have proposed methods for verifying that a given system is constant-time [3,6,9,20] or for transforming it into one that is constant-time [1,22]. As we have also discussed in this paper (cf. Theorem 3), it is easy to see that the constant-time condition directly transfers regular-channel-only security to security in the presence of timing channels. By contrast, the security implied by bucketing is less straightforward. In this paper, we have shown that bucketing is in general insufficient to guarantee the security of systems even when their regular channel is perfectly secure. And we have presented results showing that, under certain conditions, regular-channel-only security can be transferred to the side-channel-observing case to certain degrees. Because bucketing has advantages such as efficiency and ease of implementation [7,14,26,27,41], we hope that our results will contribute to a better understanding of the bucketing technique and foster further research on the topic.

#### **5 Conclusion and Future Work**

In this paper, we have presented a formal analysis of the effectiveness of the bucketing technique against adaptive timing-channel-observing adversaries. We have shown that bucketing is in general insufficient against such adversaries, and we have presented two novel conditions, SRSCR and LISCNI, that guarantee security against them. SRSCR states that a system that satisfies it is secure, whereas LISCNI states that a system that satisfies it becomes secure once bucketing is applied. We have shown that both conditions facilitate proving the security of systems against adaptive side-channel-observing adversaries by allowing a system designer to prove the security of the system's regular channel separately from the concerns of its side-channel behavior. By doing so, the security of the regular channel is transferred, to certain degrees, to full side-channel-aware security. We have also shown that the LISCNI condition can be used in conjunction with the constant-time implementation technique in a compositional manner to further increase its applicability. We have formalized our results via the notion of (f, ε)-security, which gives precise bounds on the number of queries needed by adaptive adversaries to recover secrets.

While we have instantiated our results for timing channels and bucketing, many of the results are actually quite general and are applicable to side channels other than timing channels. Specifically, aside from the compositional bucketing result, which exploits the "additive" nature of timing channels, the results are applicable to any side channels and to techniques that reduce the number of possible side-channel observations.

As future work, we would like to extend our results to probabilistic systems. Currently, our results are limited to deterministic systems, and such an extension would be needed to assess the effect of bucketing when it is used together with countermeasure techniques that involve randomization. We would also like to improve the conditions and the security bounds thereof to be able to better analyze systems such as the leaky login program shown in Examples 1, 2 and 3. Finally, we would like to extend the applicability of the compositional bucketing technique by considering more patterns of compositions, such as sequentially composing components that themselves have been made secure by compositional bucketing.

**Acknowledgements.** We thank the anonymous reviewers for useful comments. This work was supported by JSPS KAKENHI Grant Numbers 17H01720 and 18K19787, JSPS Core-to-Core Program, A. Advanced Research Networks, JSPS Bilateral Collaboration Research, and Office of Naval Research (ONR) award #N00014-17-1-2787.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# A Dependently Typed Library for Static Information-Flow Control in Idris

Simon Gregersen(B) , Søren Eller Thomsen, and Aslan Askarov

Aarhus University, Aarhus, Denmark
{gregersen,sethomsen,askarov}@cs.au.dk

Abstract. Safely integrating third-party code in applications while protecting the confidentiality of information is a long-standing problem. Pure functional programming languages, like Haskell, make it possible to enforce lightweight information-flow control through libraries like MAC by Russo. This work presents DepSec, a MAC-inspired, dependently typed library for static information-flow control in Idris. We showcase how adding dependent types increases the expressiveness of state-of-the-art static information-flow control libraries and how DepSec matches a special-purpose dependent information-flow type system on a key example. Finally, we show novel and powerful means of specifying statically enforced declassification policies using dependent types.

Keywords: Information-flow control · Dependent types · Idris

### 1 Introduction

Modern software applications are increasingly built using libraries and code from multiple third parties. At the same time, protecting the confidentiality of information manipulated by such applications is a growing, yet long-standing, problem. Third-party libraries could in general have been written by anyone, and they are usually run with the same privileges as the main application. While powerful, such privileges open the door to abuse.

Traditionally, access control [7] and encryption have been the main means of preventing data dissemination and leakage; however, such mechanisms fall short when third-party code needs access to sensitive information to provide its functionality. The key observation is that these mechanisms only place restrictions on access to information, not on its propagation. Once information is accessed, the accessor is free to improperly transmit or leak it in some form, whether by intention or by error.

Language-based Information-Flow Control [36] is a promising technique for enforcing information security. Traditional enforcement techniques analyze how information at different security levels flows within a program, ensuring that information flows only to appropriate places and suppressing illegal flows. To achieve this, most information-flow control tools require the design of new languages, compilers, or interpreters (e.g., [12,17,22,23,26,29,39]). Despite a large and growing body of work on language-based information-flow security, there has been little adoption of the proposed techniques. For information-flow policies to be enforced in such systems, the whole system has to be written in a new language – an inherently expensive and time-consuming process for large software systems. Moreover, in practice, it may well be that only small parts of an application are governed by information-flow policies.

Pure functional programming languages, like Haskell, have something to offer with respect to information security, as they strictly separate side-effect-free and side-effectful code. This makes it possible to enforce lightweight information-flow control through libraries [11,20,34,35,42] by constructing an embedded domain-specific security sub-language. Such libraries enforce a secure-by-construction programming model, as any program written against the library interface is not capable of leaking secrets. This construction forces the programmer to write security-critical code in the sub-language but otherwise allows them to freely interact and integrate with non-security-critical code written in the full language. In particular, static enforcement libraries like MAC [34] are appealing, as no run-time checks are needed and code that exhibits illegal flows is rejected by the type checker at compile time. Naturally, the expressiveness of Haskell's type system sets the limit on which programs can be deemed secure and which information-flow policies can be guaranteed.

Dependent type theories [24,31] are implemented in many programming languages, such as Coq [13], Agda [32], Idris [8], and F<sup>∗</sup> [44]. Programming languages that implement such theories allow types to depend on values. This enables programmers to give programs very precise types and thereby gain increased confidence in their correctness.

In this paper, we show that dependent types provide a direct and natural way of expressing precise data-dependent security policies. Dependent types can be used to represent rich security policies in environments like databases and data-centric web applications where, for example, new classes of users and new kinds of data are encountered at run-time and the security level depends on the manipulated data itself [23]. Such dependencies are not expressible in less expressive systems like MAC. Among other things, with dependent types, we can construct functions where the security level of the output depends on an argument:

```
getPassword : (u : Username) -> Labeled u String
```

Given a user name u, getPassword retrieves the corresponding password and classifies it at the security level of u. As such, we can express much more precise security policies that can depend on the manipulated data.

Idris is a general-purpose functional programming language with full-spectrum dependent types, that is, there are no restrictions on which values may appear in types. The language is strongly influenced by Haskell and has, among other things, inherited its strict encapsulation of side effects. Idris essentially asks the question: "What if Haskell had full dependent types?" [9]. This work, essentially, asks:

"What if MAC had full dependent types?"

We address this question using Idris because of its positioning as a general-purpose language rather than a proof assistant. All ideas should be portable to equally expressive systems with full dependent types and strict monadic encapsulation of side effects.

In summary, the contributions of this paper are as follows.


*Outline.* The rest of the paper proceeds through a presentation of the DepSec library (Sect. 2); a conference manager case study (Sect. 3) and the introduction of policy-parameterized functions (Sect. 4) both showcasing the expressiveness of DepSec; means to specify statically-ensured declassification policies (Sect. 5); soundness of the core library (Sect. 6); and related work (Sect. 7).

All code snippets presented in the following are extracts from the source code. All source code is implemented in Idris 1.3.1 and available at

https://github.com/simongregersen/DepSec.

#### 1.1 Assumptions and Threat Model

In the rest of this paper, we require that code is divided into trusted code, written by someone we trust, and untrusted code, written by a potential attacker. The trusted computing base (TCB) has no restrictions, but untrusted code does not have access to modules providing input/output behavior, the data constructors of the domain-specific language, or a few specific functions related to declassification. In Idris, this means that we specifically do not allow access to IO functions and unsafePerformIO. In DepSec, constructors and functions marked with a TCB comment are inaccessible to untrusted code. Throughout the paper we will emphasize when this is the case.

We require that all definitions made by untrusted code are total, that is, defined for all possible inputs and are guaranteed to terminate. This is necessary if we want to trust proofs given by untrusted code. Otherwise, it could construct an element of the empty type from which it could prove anything:

```
empty : Void
empty = empty
```

In Idris, this can be checked using the --total compiler flag. Furthermore, we do not consider concurrency nor any internal or termination covert channels.

### 2 The DepSec Library

In information-flow control, labels are used to model the sensitivity of data. Such labels usually form a security lattice [14] where the induced partial ordering ⊑ specifies the allowed flows of information and hence the security policy. For example, ℓ<sub>1</sub> ⊑ ℓ<sub>2</sub> specifies that data with label ℓ<sub>1</sub> is allowed to flow to entities with label ℓ<sub>2</sub>. In DepSec, labels are represented by values that form a verified join semilattice implemented as Idris interfaces<sup>1</sup>. That is, we require proofs of the lattice properties when defining an instance of JoinSemilattice.

```
interface JoinSemilattice a where
 join : a -> a -> a
 associative :
   (x, y, z : a) -> x `join` (y `join` z) = (x `join` y) `join` z
 commutative : (x, y : a) -> x `join` y = y `join` x
 idempotent : (x : a) -> x `join` x = x
```
Dependent function types (often referred to as Π types) in Idris can express such requirements. If A is a type and B is a type indexed by a value of type A, then (x : A) -> B x is the type of functions that map arguments x of type A to values of type B x.
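For intuition, the three laws that JoinSemilattice demands can be checked by exhaustive enumeration on a small concrete lattice. The following Python sketch is our illustration only, not part of DepSec; in Idris these laws must be proved at compile time.

```python
# A two-point security lattice {Low, High} with join = least upper bound,
# checked against the JoinSemilattice laws by brute force.

LABELS = ["Low", "High"]
RANK = {"Low": 0, "High": 1}

def join(x: str, y: str) -> str:
    """Least upper bound of two labels."""
    return x if RANK[x] >= RANK[y] else y

assert all(join(x, join(y, z)) == join(join(x, y), z)
           for x in LABELS for y in LABELS for z in LABELS)   # associative
assert all(join(x, y) == join(y, x)
           for x in LABELS for y in LABELS)                   # commutative
assert all(join(x, x) == x for x in LABELS)                   # idempotent
```

Where Idris demands these proofs once and for all, the enumeration above only witnesses them for one finite lattice.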

A lattice induces a partial ordering, which gives a direct way to express flow constraints. We introduce a verified partial ordering together with an implementation of this for JoinSemilattice. That is, to define an instance of the Poset interface we require a concrete instance of an associated data type leq as well as proofs of necessary algebraic properties of leq.

```
interface Poset a where
  leq : a -> a -> Type
  reflexive : (x : a) -> x `leq` x
  antisymmetric : (x, y : a) -> x `leq` y -> y `leq` x -> x = y
  transitive : (x, y, z : a) -> x `leq` y -> y `leq` z -> x `leq` z
implementation JoinSemilattice a => Poset a where
  leq x y = (x `join` y = y)
  ...
```
This definition allows generic functions to impose as few restrictions as possible on the user while being able to exploit the algebraic structure in proofs, as will become evident in Sects. 3 and 4. For the sake of the following case studies, we also define a BoundedJoinSemilattice, requiring a least element Bottom of an instance of JoinSemilattice together with a proof that this element is the unit of join.
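Continuing the brute-force illustration (ours, not DepSec code), the ordering derived from the join as in the Poset implementation above, leq x y := (x `join` y = y), can be checked to satisfy the Poset laws on a small powerset lattice:

```python
# Powerset lattice of {"a", "b"} with join = set union; the derived ordering
# is leq x y := (join x y == y), as in the Poset implementation for
# JoinSemilattice.

from itertools import product

LABELS = [frozenset(s) for s in [(), ("a",), ("b",), ("a", "b")]]

def join(x, y):
    return x | y

def leq(x, y):
    return join(x, y) == y          # x `leq` y  iff  x `join` y = y

assert all(leq(x, x) for x in LABELS)                          # reflexive
assert all(x == y
           for x, y in product(LABELS, LABELS)
           if leq(x, y) and leq(y, x))                         # antisymmetric
assert all(leq(x, z)
           for x, y, z in product(LABELS, LABELS, LABELS)
           if leq(x, y) and leq(y, z))                         # transitive
```

The point of the Idris version is that these properties hold for *every* instance by construction, not merely for one enumerated example.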

<sup>1</sup> Interfaces in Idris are similar to type classes in Haskell.

Fig. 1. Type signature of the core DepSec API.

*The Core API.* Figure 1 presents the type signature of DepSec's core API. Notice that names beginning with a lower-case letter that appear as a parameter or index in a type declaration are automatically bound as implicit arguments in Idris, and the auto annotation on implicit arguments means that Idris will attempt to fill in the implicit argument by searching the calling context for an appropriate value.

Abstract data type Labeled ℓ a denotes a value of type a with sensitivity level ℓ. We say that Labeled ℓ a is *indexed* by ℓ and *parameterized* by a. Abstract data type DIO ℓ a denotes a secure computation that handles values with sensitivity level ℓ and results in a value of type a. It is internally represented as a wrapper around the regular IO monad that, similar to the one in Haskell, can be thought of as a state monad where the state is the entire world. Notice that both data constructors MkLabeled and MkDIO are unavailable to untrusted code, as they would allow pattern matching and uncontrolled unwrapping of protected entities. As a consequence, we introduce functions label and unlabel for labeling and unlabeling values. Like Rajani and Garg [33], but unlike MAC, the type signature of label imposes no lattice constraints on the computation context. This does not leak information as, if l ⋢ l′ and a computation c has type DIO l (Labeled l′ V) for any type V, then there is no way for the labeled return value of c to escape the computation context with label l.

As in MAC, the API contains a function plug that safely integrates sensitive computations into less sensitive ones. This avoids the need for nested computations and *label creep*, that is, the raising of the current label to a point where the computation can no longer perform useful tasks [34,47]. Finally, we also provide functions run and lift, available only to trusted code, for unwrapping the DIO monad and lifting the IO monad into the DIO monad.

*Labeled Resources.* Data type Labeled ℓ a is used to denote a labeled Idris value of type a. This is an example of a *labeled resource* [34]. By itself, the core library does not allow untrusted code to perform any side effects, but we can safely incorporate, for example, file access and mutable references as other labeled resources. Figure 2 presents type signatures for files indexed by security levels, used for secure file handling, while mutable references are available in the accompanying source code. Abstract data type SecFile ℓ denotes a secure file with sensitivity level ℓ. As for Labeled ℓ a, the data constructor MkSecFile is not available to untrusted code.

The function readFile takes as input a secure file SecFile l' and returns a computation with sensitivity level l that returns a labeled value with sensitivity level l'. Notice that the l ⊑ l' flow constraint is required to enforce the *no read-up* policy [7]. That is, the result of the computation returned by readFile only involves data with sensitivity at most l. The function writeFile takes as input a secure file SecFile l'' and a labeled value of sensitivity level l', and it returns a computation with sensitivity level l that returns a labeled value with sensitivity level l''. Notice that both the l ⊑ l' and l' ⊑ l'' flow constraints are required, essentially enforcing the *no write-down* policy [7], that is, the file never receives data more sensitive than its sensitivity level.
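Stated dynamically, the flow checks that DepSec discharges statically look as follows. This Python sketch is our own re-statement of the constraints described above, with hypothetical names; it is not the DepSec API, which enforces these checks at compile time.

```python
# Dynamic re-statement (ours) of the static constraints on readFile and
# writeFile, over the two-point lattice Low ⊑ High.

RANK = {"Low": 0, "High": 1}

def leq(x: str, y: str) -> bool:
    return RANK[x] <= RANK[y]

def may_read(l: str, l_file: str) -> bool:
    """readFile in a DIO l computation on a SecFile l' requires l ⊑ l'."""
    return leq(l, l_file)

def may_write(l: str, l_val: str, l_file: str) -> bool:
    """writeFile requires l ⊑ l' and l' ⊑ l'' (no write-down)."""
    return leq(l, l_val) and leq(l_val, l_file)

# A Low computation may write Low data to a High file...
assert may_write("Low", "Low", "High")
# ...but High data may never reach a Low file.
assert not may_write("High", "High", "Low")
```

In DepSec, a program violating these constraints simply fails to type-check; no run-time check of this kind is ever executed.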

Finally, notice that the standard library functions for reading and writing files in Idris used to implement the functions in Fig. 2 do not raise exceptions. Rather, both functions return an instance of the sum type Either. We stay consistent with Idris' choice for this instead of adding exception handling as done in MAC.

Fig. 2. Type signatures for secure file handling.

#### 3 Case Study: Conference Manager System

This case study showcases the expressiveness of DepSec by reimplementing a conference manager system with a fine-grained, data-dependent security policy introduced by Lourenço and Caires [23]. Lourenço and Caires base their development on a minimal λ-calculus with references and collections, and they show how secure operations in relevant scenarios can be modelled and analysed using *dependent information flow types*. Our reimplementation demonstrates how DepSec matches the expressiveness of such a purpose-built dependent type system on a key example.

In this scenario, a user is either a regular user, an author user, or a program committee (PC) member. The conference manager contains information about the users, their submissions, and submission reviews. This data is stored in lists of references to records, and the goal is to statically ensure, by typing, the confidentiality of the data stored in the conference manager system. As such, the security policy is:


To achieve this security policy, Lourenço and Caires make use of indexed security labels [22]. The security level *U* is partitioned into a number of security compartments such that *U*(*uid*) represents the compartment of the registered user with id *uid*. Similarly, the security level *A* is indexed such that *A*(*uid*, *sid*) stands for the compartment of data belonging to author *uid* and their submission *sid*, and *PC* is indexed such that *PC*(*uid*, *sid*) stands for data belonging to the PC member with user id *uid* assigned to review the submission with id *sid*. Furthermore, levels ⊤ and ⊥ are introduced such that, for example, *U*(⊥) ⊑ *U*(*uid*) ⊑ *U*(⊤). Now, the security lattice is defined using two equations:

$$\forall uid, sid. \ U(uid) \sqsubseteq A(uid, sid) \tag{1}$$

$$\forall uid1, uid2, sid. \ A(uid1, sid) \sqsubseteq PC(uid2, sid) \tag{2}$$

Lourenço and Caires are able to type a list of submissions with a dependent sum type that assigns the content of the paper the security level *A*(*uid*, *sid*), where *uid* and *sid* are fields of the record. For example, if a concrete submission with identifier 2 was made by the user with identifier 1, the content of the paper gets classified at security level *A*(*1*, *2*). Consequently, *A*(*1*, *2*) ⊑ *PC*(*n*, *2*) for any *uid* *n*, and the content of the paper is observable only by its assigned reviewers. Similar types are given for the list of user information and the list of submission reviews, enforcing the security policy described above.

To express this policy in DepSec, we introduce abstract data types Id and Compartment (cf. Fig. 3) followed by an implementation of the BoundedJoinSemilattice interface that satisfies Eqs. (1) and (2).
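As an executable intuition for this ordering, the two defining equations can be encoded and checked directly. The following Python approximation is ours; the actual DepSec development defines the compartments in Idris and proves the lattice properties at compile time.

```python
# Compartments are encoded as tuples: ("Bot",), ("U", uid), ("A", uid, sid),
# ("PC", uid, sid), ("Top",).  leq implements reflexivity, the bounds, and
# Eqs. (1) and (2); it is an illustration, not the full lattice ordering.

def leq(x, y) -> bool:
    if x == y or x == ("Bot",) or y == ("Top",):
        return True
    if x[0] == "U" and y[0] == "A":
        return x[1] == y[1]          # Eq. (1): U(uid) ⊑ A(uid, sid)
    if x[0] == "A" and y[0] == "PC":
        return x[2] == y[2]          # Eq. (2): A(uid1, sid) ⊑ PC(uid2, sid)
    if x[0] == "U" and y[0] == "PC":
        return True                  # by transitivity through A(uid, sid)
    return False

# The submission of author 1 with id 2 is readable by any assigned reviewer...
assert leq(("A", 1, 2), ("PC", 7, 2))
# ...but not by a PC member assigned to a different submission.
assert not leq(("A", 1, 2), ("PC", 7, 3))
```

This captures the example from the text: *A*(*1*, *2*) flows to *PC*(*n*, *2*) for any reviewer *n*, and to no other compartment.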



Fig. 3. Abstract data types for the conference manager sample security lattice.

Fig. 4. Conference manager types encoded with DepSec.

Using the above, the required dependent sum types can easily be encoded with DepSec in Idris, as presented in Fig. 4. With these typings in place, implementing the examples from Lourenço and Caires [23] is straightforward. For example, the function viewAuthorPapers takes as input a list of submissions and a user identifier uid1, from which it returns a computation that returns the list of submissions authored by the user with identifier uid1. Notice that uid denotes the automatically generated record projection function that retrieves the field uid of the record, and that (x : A ** B) is notation for a dependent pair (often referred to as a Σ type) where A and B are types and B may depend on x.

```
viewAuthorPapers : Submissions
                -> (uid1 : Id)
                -> DIO Bottom (List (sub : Submission ** uid1 = (uid sub)))
```
The addCommentSubmission operation is used by the PC members to add comments to the submissions. The function takes as input a list of reviews, a user identifier of a PC member, a submission identifier, and a comment with label A uid1 sid1. It returns a computation that updates the PC\_only field in the review of the paper with identifier sid1.

```
addCommentSubmission : Reviews -> (uid1 : Id) -> (sid1 : Id)
                    -> Labeled (A uid1 sid1) String
                    -> DIO Bottom ()
```
Notice that to implement this specific type signature, up-classification is necessary to assign the comment with type Labeled (A uid1 sid1) String to a field with type Labeled (PC uid sid1) String. This can be achieved soundly with the relabel primitive introduced by Vassena et al. [47], as A uid1 sid1 ⊑ PC uid sid1. We include this primitive in the accompanying source code together with several other examples. The entire case study amounts to about 300 lines of code, half of which implement and verify the lattice.

#### 4 Policy-Parameterized Functions

A consequence of using a dependently typed language, and the design of DepSec, is that functions can be defined such that they abstract over the security policy while retaining precise security levels. This makes it possible to reuse code across different applications and write other libraries on top of DepSec. We can exploit the existence of a lattice join, the induced poset, and their verified algebraic properties to write such functions.

Fig. 5. Reading two files to a string labeled with the join of the labels of the files.

Figure 5 presents the function readTwoFiles, which is parameterized by a bounded join semilattice. It takes two secure files with labels l and l' as input and returns a computation that concatenates the contents of the two files, labeled with the join of l and l'. To implement this, we make use of the unlabel and readFile primitives from Figs. 1 and 2, respectively. This computation unlabels the contents of the files and returns the concatenation of the contents if no file error occurred. Notice that pure is the Idris function for monadic return, corresponding to the return function in Haskell. Finally, this computation is plugged into the surrounding computation. Notice how the usage of readFile and unlabel introduces several proof obligations, namely ⊥ ⊑ l, ⊥ ⊑ l', l ⊑ l ⊔ l', and l' ⊑ l ⊔ l'. When working on a concrete lattice these obligations are usually fulfilled by Idris' automatic proof search but, currently, such proofs need to be given manually in the general case. All obligations follow immediately from the algebraic properties of the bounded semilattice and are given in three auxiliary lemmas leq_bot_x, join_x_xy, and join_y_xy available in the accompanying source code (amounting to 10 lines of code).
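
As a sketch, the statements of the three auxiliary lemmas might look as follows. The lemma names come from the text above, but the exact formulation in the accompanying source code may differ; this is an illustration of their intent, not the authoritative development.

```
-- Sketch only: plausible statements of the three auxiliary lemmas.
-- `leq` and `join` are the order and join of the semilattice interfaces;
-- Bottom denotes the least element of the bounded variant.
leq_bot_x : BoundedJoinSemilattice a => (x : a) -> Bottom `leq` x
join_x_xy : JoinSemilattice a => (x, y : a) -> x `leq` (x `join` y)
join_y_xy : JoinSemilattice a => (x, y : a) -> y `leq` (x `join` y)
```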

Writing functions operating on a fixed number of resources is limiting. However, the function in Fig. 5 can easily be generalized to a function working on an arbitrary data structure containing files with different labels from an arbitrary lattice. Similar to the approach taken by Buiras et al. [11], who hide the label of a labeled value using a data type definition, we hide the label of a secure file with a dependent pair

```
GenFile : Type -> Type
GenFile label = (l : label ** SecFile l)
```
that abstracts away the concrete sensitivity level of the file. Moreover, we introduce a specialized join function

```
joinOfFiles : BoundedJoinSemilattice label
           => List (GenFile label)
           -> label
```
that folds the join function over a list of file sensitivity labels. Now, it is possible to implement a function that takes as input a list of files, reads the files, and returns a computation that concatenates all their contents (if no file error occurred) where the return value is labeled with the join of all their sensitivity labels.

```
readFiles : BoundedJoinSemilattice a
         => (files: (List (GenFile a)))
         -> DIO Bottom (Labeled (joinOfFiles files)
                         (Either (List FileError) String))
```
When implementing this, one has to satisfy non-trivial proof obligations, for example, that l ⊑ joinOfFiles files for every secure file f ∈ files whose label is l. While provable (in 40 lines of code in our development), if equality is decidable for elements of the concrete lattice we can postpone such proof obligations to a point in time where they can be discharged by reflexivity of equality. By defining a decidable lattice order

```
decLeq : JoinSemilattice a => DecEq a => (x, y : a) -> Dec (x `leq` y)
decLeq x y = decEq (x `join` y) y
```
we can get such a proof "for free" by inserting a dynamic check of whether the flow is allowed; decLeq is justified by the standard fact that in a join semilattice x ⊑ y if and only if x ⊔ y = y. With this, a readFiles' function with the exact same functionality as the original readFiles function can be implemented with minimal effort. In the code below, prf is the proof that the label l of file may flow to joinOfFiles files.

```
readFiles' : BoundedJoinSemilattice a => DecEq a
          => (files: (List (GenFile a)))
          -> DIO Bottom (Labeled (joinOfFiles files)
                          (Either (List FileError) String))
readFiles' files =
  ...
  case decLeq l (joinOfFiles files) of
    Yes prf => ...
    No _ => ...
```
The downside of this is the introduction of a negative case, the No-case, that needs handling even though it will never occur if joinOfFiles is implemented correctly.

In combination with GenFile, decLeq can be used to implement several other interesting examples, for instance, a function that reads all files with a sensitivity label below a certain label into a string labeled with that label. The accompanying source code showcases multiple such examples that exploit decidable equality.
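
A hedged sketch of the type such a function might have (the name readFilesBelow and the exact shape of the signature are our assumptions; the accompanying source code is authoritative):

```
-- Hypothetical signature: read every file whose label decidably flows to l,
-- collecting the contents at level l. Files whose labels do not flow to l
-- can be skipped, which is where decLeq's No-case comes into play.
readFilesBelow : BoundedJoinSemilattice a => DecEq a
              => (l : a)
              -> List (GenFile a)
              -> DIO Bottom (Labeled l (Either (List FileError) String))
```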

### 5 Declassification

Realistic applications often release some secret information as part of their intended behavior; this action is known as *declassification*.

In DepSec, trusted code may declassify secret information without adhering to any security policy as trusted code has access to both the DIO a and Labeled a data constructors. However, only giving trusted code the power of declassification is limiting as we want to allow the use of third-party code as much as possible. The main challenge we address is how to grant untrusted code the right amount of power such that declassification is only possible in the intended way.

Sabelfeld and Sands [38] identify four dimensions of declassification: *what*, *who*, *where*, and *when*. In this section, we present novel and powerful means for static declassification with respect to three of the four dimensions and illustrate these with several examples. To statically enforce different declassification policies, we take the approach of Sabelfeld and Myers [37] and use escape hatches, a special kind of function. In particular, we introduce the notion of a *hatch builder*: a function that creates an escape hatch for a particular resource which can only be used when a certain condition is met. Such an escape hatch can therefore be used freely by untrusted code.

### 5.1 The *what* Dimension

Declassification policies related to the *what* dimension place restrictions on exactly "what" and "how much" information is released. It is in general difficult to statically predict how data to be declassified is manipulated or changed by programs [35] but exploiting dependent types can get us one step closer.

To control what information is released, we introduce the notion of a *predicate hatch builder* only available to trusted code for producing hatches for untrusted code.

```
predicateHatchBuilder : Poset lt => {l, l' : lt} -> {D, E : Type}
                     -> (d : D)
                     -> (P : D -> E -> Type)
                     -> (d : D ** Labeled l (e : E ** P d e)
                                  -> Labeled l' E) -- TCB
```
Intuitively, the hatch builder takes as input a data structure d of type D followed by a predicate P on d and something of type E. It returns a dependent pair of the initial data structure and a declassification function from sensitivity level l to l'. To actually declassify a labeled value e of type E, one has to provide a proof that P d e holds. Notice that this proof may be constructed in the context of the sensitivity level l that we are declassifying from.

The reason for parameterizing the predicate P by a data structure of type D is to allow declassification to be restricted to a specific context or data structure. This is used in the following example of an auction system, in which only the highest bid of a specific list of bids can be declassified.

*Example.* Consider a two-point lattice where L ⊑ H and H ⋢ L, and an auction system where participants place bids secretly. All bids are labeled H and are put into a data structure BidLog. In the end, we want only the winning bid to be released and hence declassified to label L. To achieve this, we define a declassification predicate HighestBid.

```
HighestBid : BidLog -> Bid -> Type
HighestBid = \log, b => (Elem (label b) log, MaxBid b log)
```

Informally, given a log log of labeled bids and a bid b, the predicate states that the bid is in the log, Elem (label b) log, and that it is the maximum bid, MaxBid b log. We apply predicateHatchBuilder to a log of bids and the HighestBid predicate to obtain a specialized escape hatch of type BidHatch that enforces the declassification policy defined by the predicate.

```
BidHatch : Type
BidHatch = (log : BidLog ** Labeled H (b : Bid ** HighestBid log b)
                            -> Labeled L Bid)
```
This hatch can be used freely by untrusted code when implementing the auction system. By constructing a function

```
getMaxBid : (r : BidLog) -> DIO H (b : Bid ** HighestBid r b)
```

untrusted code can plug the resulting computation into an L context and declassify the result value using the argument hatch function.

```
auction : BidHatch -> DIO L (Labeled L Bid)
auction ([] ** _) = pure $ label ("no bids", 0)
auction (r :: rs ** hatch) =
  do max <- plug (getMaxBid (r :: rs))
     let max' : Labeled L Bid = hatch max
     ...
```
To prove the HighestBid predicate (which in our implementation comprises 40 lines of code), untrusted code will need a generalized unlabel function that establishes the relationship between label and the output of unlabel. The only difference from unlabel is its return type: a computation that returns a value together with a proof that labeling this value gives back the initial input. This definition poses no risk to soundness as the proof is protected by the computation sensitivity level.

```
unlabel' : Poset lt => {l,l': lt}
        -> {auto flow: l `leq` l'}
        -> (labeled: Labeled l a)
        -> DIO l' (c : a ** label c = labeled)
```
*Limiting Hatch Usage.* Notice that escape hatches can, in general, be used an indefinite number of times. The Control.ST library [10] provides facilities for creating, reading, writing, and destroying state in the type of Idris functions and, especially, allows tracking of state changes in a function type. This allows us to limit the number of times a hatch can be used. Based on a concept of resources, a dependent type STrans tracks how resources change when a function is invoked. Specifically, a value of type STrans m returnType in_res out_res represents a sequence of actions that manipulate state, where m is an underlying computation context in which the actions will be executed, returnType is the return type of the sequence, in_res is the required list of resources available before executing the sequence, and out_res is the list of resources available after executing the sequence.

To represent state transitions more directly, ST is a type-level function that computes an appropriate STrans type given an underlying computation context, a result type, and a list of *actions*, which describe transitions on resources. Actions can take multiple forms, but the one we will make use of is of the form lbl ::: ty_in :-> ty_out, which expresses that the resource lbl begins in state ty_in and ends in state ty_out. By instantiating ST with DIO l as the underlying computation context:

```
DIO' : l -> (ty : Type) -> List (Action ty) -> Type
DIO' l = ST (DIO l)
```
and using it together with a resource Attempts, we can create a function limit that applies its first argument f to its second argument arg, with Attempts (S n) as its initial required state and Attempts n as the output state.

```
limit : (f : a -> b) -> (arg : a)
     -> DIO' l b [attempts ::: Attempts (S n) :-> Attempts n]
```
That is, we encode that the function consumes "an attempt." With the limit function it is possible to create functions where users are forced, by typing, to specify how many times they are used.
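
The Attempts resource itself can be sketched as a Nat-indexed type, so that the number of remaining uses appears at the type level. This is our assumption about its shape; the accompanying source code may define it differently.

```
-- Sketch (assumption): consuming an attempt shows up in types as a
-- step from `Attempts (S n)` to `Attempts n`, which is exactly what the
-- action in limit's type expresses.
data Attempts : Nat -> Type where
  MkAttempts : Attempts n
```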

As an example, consider a variant of an example by Russo et al. [35] where we construct a specialized hatch passwordHatch that declassifies the boolean comparison of a secret number with an arbitrary number.


```
passwordHatch : (labeled : Labeled H Int)
             -> (guess : Int)
             -> DIO' l Bool [attempts ::: Attempts (S n) :-> Attempts n]
passwordHatch (MkLabeled v) = limit (\g => g == v)
```
To use this hatch, untrusted code is forced to specify how many times it is used.

```
pwCheck : Labeled H Int
       -> DIO' L () [attempts ::: Attempts (3 + n) :-> Attempts n]
pwCheck pw =
  do x1 <- passwordHatch pw 1
     x2 <- passwordHatch pw 2
     x3 <- passwordHatch pw 3
     x4 <- passwordHatch pw 4 -- type error!
     ...
```
### 5.2 The *who* and *when* Dimensions

To handle declassification policies related to *who* may declassify information and *when* declassification may happen we introduce the notion of a *token hatch builder* only available to trusted code for producing hatches for untrusted code to use.

```
tokenHatchBuilder : Poset labelType => {l, l' : labelType} -> {E, S : Type}
                 -> (Q : S -> Type)
                 -> (s : S ** Q s) -> Labeled l E -> Labeled l' E -- TCB
```
The hatch builder takes as input a predicate Q on something of type S and returns a declassification function from sensitivity level l to l', given that the user can prove the existence of some s such that Q s holds. As such, by limiting when and how untrusted code can obtain a value that satisfies predicate Q, we can construct several interesting declassification policies.

The rest of this section discusses how token hatches can be used for time-based and authority-based control of declassification; the use of the latter is demonstrated on a case study.

*Time-Based Hatches.* To illustrate the idea of token hatches for the *when* dimension of declassification, consider the following example. Let Time be an abstract data type with a data constructor only available to trusted code, and let tick : DIO l Time be a function that returns the current system time wrapped in the Time data type, such that this is the only way for untrusted code to construct anything of type Time. Notice that this does not expose an unrestricted timer API, as untrusted code cannot inspect the actual value.

Now, we instantiate the token hatch builder with a predicate that demands the existence of a Time token that is greater than some specific value.

```
TimeHatch : Time -> Type
TimeHatch t = (t' ** t <= t' = True) -> Labeled H Nat -> Labeled L Nat
```
As such, TimeHatch t can only be used after the specific point in time t has passed, as only then will untrusted code be able to satisfy the predicate.

```
timer : Labeled H Nat -> TimeHatch t -> DIO L ()
timer secret {t} timeHatch =
  do time <- tick
     case decEq (t <= time) True of
       Yes prf =>
         let declassified : Labeled L Nat = timeHatch (time ** prf) secret
         ...
       No _ => ...
```
*Authority-Based Hatches.* The *Decentralized Labeling Model* (DLM) [27] marks data with a set of principals who own the information. While executing a program, the program is given *authority*, that is, it is authorized to act on behalf of some set of principals. Declassification simply makes a copy of the released data and marks it with the same set of principals, excluding the authorities.

Similarly to Russo et al. [35], we adapt this idea such that it works on a security lattice of Principals, assign authorities with security levels from the lattice, and let authorities declassify information at that security level.

To model this, we define the abstract data type Authority with a data constructor available only to trusted code, so that having an instance of Authority s corresponds to having the authority of the principal s. Notice that the assignment of authorities to pieces of code is consequently a part of the trusted code. Now, we instantiate the token hatch builder with a predicate that demands the authority of s to declassify information at that level.

```
authHatch : { l, l' : Principal }
         -> (s ** (l = s, Authority s))
         -> Labeled l a -> Labeled l' a
authHatch {l} = tokenHatchBuilder (\s => (l = s, Authority s))
```
That is, authHatch makes it possible to declassify information at level l to l' given an instance of the Authority l data type.

*Example.* Consider the scenario of an online dating service that has the distinguishing feature of allowing its users to specify the visibility of their profiles at a fine-grained level. To achieve this, the service allows users to provide a *discovery agent* that controls their visibility. Consider a user, Bob, whose implementation of the discovery agent takes as input his own profile and the profile of another user, say Alice. The agent returns a possibly side-effectful computation that returns an option type indicating whether Bob wants to be discovered by Alice. If that is the case, a profile is returned by the computation with the information about Bob that he wants Alice to be able to see. When Alice searches for candidate matches, her profile is run against the discovery agents of all candidates and the result is added to her browsing queue.

To implement this dating service, we define the record type ProfileInfo A that contains personal information related to principal A.

```
record ProfileInfo (A : Principal) where
 constructor MkProfileInfo
 name : Labeled A String
 gender : Labeled A String
 birthdate : Labeled A String
 ...
```
The interesting part of the dating service is the implementation of discovery agents. Figure 6 presents a sample discovery agent that matches all profiles of the opposite gender and only releases information about the name and gender. The discovery agent demands the authority of A and takes as input two profiles a : ProfileInfo A and b : ProfileInfo B. The security level of the resulting computation is B, so to incorporate information from a into the result, declassification is needed. This is achieved by providing authHatch with the authority proof of A. The discovery agent sampleDiscoverer in Fig. 6 unlabels B's gender, declassifies and unlabels A's gender and name, and compares the two genders. If the genders match, a profile of type ProfileInfo B containing only the name and gender of A is returned. Otherwise, Nothing is returned, indicating that A does not want to be discovered. Notice that Refl is the constructor for the built-in equality type in Idris; it is used to construct the proof of equality between principals required by the hatch.

Fig. 6. A discovery agent that matches with all profiles of the opposite gender and only releases the name and gender.

### 6 Soundness

Recent works [46,47] present a mechanically-verified model of MAC and show progress-insensitive noninterference (PINI) for a sequential calculus. We use this work as a starting point and discuss the necessary modifications in the following. Notice that this work does not consider any declassification mechanisms and neither do we; we leave this as future work.

The proof relies on the *two-steps erasure* technique, an extension of the *term erasure* [21] technique that ensures that the same public output is produced if secrets are erased before or after program execution. The technique relies on a type-driven erasure function ε_ℓA on terms and configurations, where ℓA denotes the attacker security level. A configuration consists of an ℓ-indexed compartmentalized store Σ and a term t. A configuration ⟨Σ, t⟩ is erased by erasing t and by erasing Σ pointwise, i.e., ε_ℓA(Σ) = λℓ. ε_ℓA(Σ(ℓ)). On terms, the function essentially rewrites data and computations above ℓA to a special • value. The full definition of the erasure function is available in the full version of this paper [15]. From this definition, the definition of low-equivalence of configurations follows.
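
As an illustration (a sketch only; the authoritative definition is in the full version of the paper [15]), the case of the erasure function for labeled values rewrites the payload to • exactly when its label is not visible to the attacker:

```
\varepsilon_{\ell_A}(\mathsf{Labeled}\ \ell\ v) =
  \begin{cases}
    \mathsf{Labeled}\ \ell\ \bullet, & \ell \not\sqsubseteq \ell_A \\
    \mathsf{Labeled}\ \ell\ \varepsilon_{\ell_A}(v), & \text{otherwise}
  \end{cases}
```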

Definition 1. *Let c₁ and c₂ be configurations. c₁ and c₂ are said to be ℓA-equivalent, written c₁ ≈_ℓA c₂, if and only if ε_ℓA(c₁) ≡ ε_ℓA(c₂).*

After defining the erasure function, the noninterference theorem follows from showing a *single-step simulation* relationship between the erasure function and a small-step reduction relation: erasing sensitive data from a configuration and then taking a step is the same as first taking a step and then erasing sensitive data. This is the content of the following proposition.

Proposition 1. *If c₁ ≈_ℓA c₂, c₁ → c₁′, and c₂ → c₂′, then c₁′ ≈_ℓA c₂′.*

The main theorem follows by repeated applications of Proposition 1.

Theorem 1 (PINI). *If c₁ ≈_ℓA c₂, c₁ ⇓ c₁′, and c₂ ⇓ c₂′, then c₁′ ≈_ℓA c₂′.*

Both the statement and the proof of noninterference for DepSec are mostly similar to the ones for MAC and available in the full version of this paper [15]. Nevertheless, one has to be aware of a few subtleties.

First, one has to realize that even though dependent types in a language like Idris may depend on data, the data itself is not a part of a value of a dependent type. Recall the type Vect n Nat of vectors of length n with components of type Nat and consider the following program.

```
length : Vect n a -> Nat
length {n = n} xs = n
```

This example may lead one to believe that it is possible to extract data from a dependent type. This is *not* the case. Both n and a are implicit arguments to the length function that the compiler is able to infer. The actual type is

```
length : {n : Nat} -> {a : Type} -> Vect n a -> Nat
```

As a high-level dependently typed functional programming language, Idris is elaborated to a low-level core language based on dependent type theory [9]. In the elaboration process, such implicit arguments are made explicit when functions are defined and inferred when functions are invoked. This means that in the underlying core language, only explicit arguments are given. Our modeling given in the full version of this paper reflects this fact soundly.

Second, to model the extended expressiveness of DepSec, we extend both the semantics and the type system with compile-time pure-term reduction and higher-order dependent types. These definitions are standard (defined for Idris by Brady [9]) and available in the full version of our paper. Moreover, as types now become first-class terms, the definition of ε_ℓA has to be extended to cover the new kinds of terms. As before, primitive types are unaffected by the erasure function, but dependent and indexed types, such as the type DIO, have to be erased homomorphically, e.g., ε_ℓA(DIO ℓ τ : Type) = DIO ε_ℓA(ℓ) ε_ℓA(τ). The intuition of why this is sensible comes from the observation that indexed dependent types considered as terms may contain values that will have to be erased. This is purely a technicality of the proof. If defined otherwise, the erasure function would not commute with capture-avoiding substitution on terms, ε_ℓA(t[v/x]) = ε_ℓA(t)[ε_ℓA(v)/x], which is vital for the remaining proof.

### 7 Related Work

*Security Libraries.* The pioneering and formative work by Li and Zdancewic [20] shows how *arrows* [18], a generalization of monads, can provide information-flow control without runtime checks as a library in Haskell. Tsai et al. [45] further extend this work to handle side-effects, concurrency, and heterogeneous labels. Russo et al. [35] eliminate the need for arrows and implement the security library SecLib in Haskell based solely on monads. Rather than labeled values, this work introduces a monad that statically labels side-effect-free values. Furthermore, it presents combinators to dynamically specify and enforce declassification policies that bear a resemblance to the policies that DepSec is able to enforce statically.

The security library LIO [41,42] dynamically enforces information-flow control in both sequential and concurrent settings. Stefan et al. [40] extend the security guarantees of this work to also cover exceptions. Similar to this work, Stefan et al. [42] present a simple API for implementing secure conference reviewing systems in LIO with support for data-dependent security policies.

Inspired by the design of SecLib and LIO, Russo [34] introduces the security library MAC. The library statically enforces information-flow control in the presence of advanced features like exceptions, concurrency, and mutable data structures by exploiting Haskell's type system to impose flow constraints. Vassena and Russo [46], Vassena et al. [47] show progress-insensitive noninterference for MAC in a sequential setting and progress-sensitive noninterference in a concurrent setting, both using the two-steps erasure technique.

The flow constraints enforcing confidentiality of read and write operations in DepSec are identical to those of MAC. This means that the examples from MAC that do not involve concurrency can be ported directly to DepSec. To the best of our knowledge, data-dependent security policies like the one presented in Sect. 3 cannot be expressed and enforced in MAC, unlike in LIO, which allows such policies to be enforced dynamically. DepSec allows such security policies to be enforced statically. Moreover, Russo [34] does not consider declassification. To address the static limitations of MAC, HLIO [11] takes a hybrid approach by exploiting advanced features of Haskell's type system like singleton types and constraint polymorphism. Buiras et al. [11] are able to statically enforce information-flow control while allowing selected security checks to be deferred until run-time.

*Dependent Types for Security.* Several works have considered the use of dependent types to capture the nature of data-dependent security policies. Zheng and Myers [51,52] proposed the first dependent security type system for dealing with dynamic changes to runtime security labels in the context of Jif [29], a full-fledged IFC-aware compiler for Java programs, where, similar to our work, operations on labels are modeled at the level of types. Zhang et al. [50] use dependent types in a similar fashion in the design of a hardware description language for timing-sensitive information-flow security.

A number of functional languages have been developed with dependent type systems and used to encode value-dependent information flow properties, e.g., Fine [43]. These approaches require the adoption of entirely new languages and compilers, whereas DepSec is embedded in an already existing language. Morgenstern and Licata [25] encode an authorization- and IFC-aware programming language in Agda. However, their encoding does not consider side-effects. Nanevski et al. [30] use dependent types to verify information flow and access control policies in an interactive manner.

Lourenço and Caires [23] introduce the notion of *dependent information-flow types* and propose a *fine-grained* type system; every value and function has an associated security level. Their approach differs from the *coarse-grained* approach taken in our work, where only some computations and values have associated security labels. Rajani and Garg [33] show that both approaches are equally expressive for static IFC techniques, and Vassena et al. [48] show the same for dynamic IFC techniques.

*Principles for Information Flow.* Bastys et al. [6] put forward a set of informal principles for information flow security definitions and enforcement mechanisms: *attacker-driven security, trust-aware enforcement, separation of policy annotations and code, language-independence, justified abstraction, and permissiveness*.

DepSec follows the principle of trust-aware enforcement, as we make clear the boundary between the trusted and untrusted components in the program. Additionally, the design of our declassification mechanism follows the principle of separation of policy annotations and code. The use of dependent types increases the permissiveness of our enforcement as we discuss throughout the paper. While our approach is not fully language-independent, we posit that the approach may be ported to other programming languages with general-purpose dependent types.

*Declassification Enforcement.* Our hatch builders are reminiscent of the downgrading policies of Li and Zdancewic [19]. For example, similar to theirs, DepSec's declassification policies naturally express the idea of *delimited release* [36], which provides an explicit characterization of the declassifying computation. Moreover, DepSec can express the broad range of policies that can be phrased as predicates, an improvement over simple expression-based enforcement mechanisms for delimited release [4,5,36].

An interesting point in the design of declassification policies is *robust declassification* [49] that demands that untrusted components must not affect information release. *Qualified robustness* [2,28] generalizes this notion by giving untrusted code a limited ability to affect information release through the introduction of an explicit endorsement operation. Our approach is orthogonal to both notions of robustness as the intent is to let the untrusted components declassify information but only under very controlled circumstances while adhering to the security policy.

### 8 Conclusion and Future Work

In this paper, we have presented DepSec – a library for statically enforced information-flow control in Idris. Through several case studies, we have showcased how the DepSec primitives increase the expressiveness of state-of-the-art information-flow control libraries and how DepSec matches the expressiveness of a special-purpose dependent information-flow type system on a key example. Moreover, the library allows programmers to implement policy-parameterized functions that abstract over the security policy while retaining precise security levels.

By taking ideas from the literature and by exploiting dependent types, we have shown powerful means of specifying statically enforced declassification policies related to *what*, *who*, and *when* information is released. Specifically, we have introduced the notion of predicate hatch builders and token hatch builders that rely on the fulfillment of predicates and possession of tokens for declassification. We have also shown how the ST monad [10] can be used to limit hatch usage statically.

Finally, we have discussed the necessary means to show progress-insensitive noninterference in a sequential setting for a dependently typed information-flow control library like DepSec.

*Future Work.* There are several avenues for further work. Integrity is vital in many security policies and is considered in neither MAC nor DepSec. It will be interesting to take integrity and the presence of concurrency into the dependently typed setting and to consider internal and termination covert channels as well. It also remains to prove our declassification mechanisms sound. Here, attacker-centric epistemic security conditions [3,16] that intuitively express many declassification policies may be a good starting point.

Acknowledgements. Thanks are due to Mathias Vorreiter Pedersen, Bas Spitters, Alejandro Russo, and Marco Vassena for their valuable insights and the anonymous reviewers for their comments on this paper. This work is partially supported by DFF project 6108-00363 from The Danish Council for Independent Research for the Natural Sciences (FNU), Aarhus University Research Foundation, and the Concordium Blockchain Research Center, Aarhus University, Denmark.

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Achieving Safety Incrementally with Checked C**

Andrew Ruef<sup>1(B)</sup>, Leonidas Lampropoulos<sup>1,2</sup>, Ian Sweet<sup>1</sup>, David Tarditi<sup>3</sup>, and Michael Hicks<sup>1</sup>

> <sup>1</sup> University of Maryland, College Park, USA — {awruef,llampro,ins,mwh}@cs.umd.edu
> <sup>2</sup> University of Pennsylvania, Philadelphia, USA
> <sup>3</sup> Microsoft Research, Kirkland, USA — dtarditi@microsoft.com

**Abstract.** Checked C is a new effort working toward a memory-safe C. Its design is distinguished from that of prior efforts by truly being an *extension* of C: Every C program is also a Checked C program. Thus, one may make incremental safety improvements to existing codebases while retaining backward compatibility. This paper makes two contributions. First, to help developers convert existing C code to use so-called *checked* (i.e., safe) pointers, we have developed a preliminary, automated porting tool. Notably, this tool takes advantage of the flexibility of Checked C's design: The tool need not perfectly classify every pointer, as required of prior all-or-nothing efforts. Rather, it can make a best effort to convert more pointers accurately, without letting inaccuracies inhibit compilation. However, such partial conversion raises the question: If safety violations can still occur, what sort of advantage does using Checked C provide? We draw inspiration from research on migratory typing to make our second contribution: We prove a *blame* property that renders so-called *checked regions* blameless of any run-time failure. We formalize this property for a core calculus and mechanize the proof in Coq.

### **1 Introduction**

Vulnerabilities that compromise *memory safety* are at the heart of many attacks. *Spatial safety*, one aspect of memory safety, is ensured when every pointer dereference stays within the memory allocated to that pointer. *Buffer overruns* violate spatial safety and still constitute a common cause of vulnerability. During 2012–2018, buffer overruns were the source of 9.7% to 18.4% of CVEs reported in the NIST vulnerability database [27], constituting the leading single cause of CVEs.

The source of memory unsafety starts with the language definitions of C and C++, which render out-of-bounds pointer dereferences "undefined." Traditional compilers assume they never happen. Many efforts over the last 20 years have aimed for greater assurance by proving that accesses are in bounds, and/or preventing out-of-bounds accesses from happening via inserted dynamic checks [1–10,12,15,16,18,22,25,26,29]. This paper focuses on *Checked C*, a new, freely available<sup>1</sup> language design for a memory-safe C [11], currently focused on spatial safety. Checked C draws substantial inspiration from prior safe-C efforts but differs in two key ways, both of which focus on backward compatibility with, and incremental improvement of, regular C code.

*Mixing Checked and Legacy Pointers.* First, as outlined in Sect. 2, Checked C permits intermixing checked (safe) pointers and legacy pointers. The former come in three varieties: pointers to single objects, Ptr<τ>; pointers to arrays, Array_ptr<τ>; and pointers to NUL-terminated arrays, Nt_array_ptr<τ>. The latter two have an associated clause that describes their known length in terms of constants and other program variables. The specified length is used either to prove pointer dereferences safe or, barring that, to serve as the basis of dynamic checks inserted by the compiler.

Importantly, checked pointers are represented as in normal C—no changes to pointer structure (e.g., "fattening" a pointer to include its bounds) are imposed. As such, interoperation with legacy C is eased. Moreover, the fact that checked and legacy pointers can be intermixed in the same module eases the porting process, including porting via automated tools. Consider CCured [26], which works by automatically classifying existing pointers and compiling them for safety. This classification is necessarily conservative. For example, if a function f(p) is mostly called with safe pointers but once with an unsafe one (e.g., a "wild" pointer in CCured parlance, perhaps constructed from an **int**), then the classification of p as unsafe will propagate backward, poisoning the classification of the safe pointers too. The programmer will be forced to change the code and/or pay a higher cost for added (but unnecessary) run-time checks.

On the other hand, in the Checked C setting, if a function uses a pointer safely then its parameter can be typed that way. It is then up to a caller whose pointer arguments cannot also be made safe to insert a local cast. Section 5 presents a preliminary, whole-program analysis called *checked-c-convert* that exploits the extra flexibility afforded by mixing pointers to partially convert a C program to a Checked C program. On a benchmark suite of five programs totaling more than 200K LoC, we find that thousands of pointer locations are made more precise than they would have been under a more conservative algorithm like CCured's. The *checked-c-convert* tool is distributed with the publicly available Checked C codebase.

*Avoiding Blame with Checked Regions.* An important question is what "safety" means in a program with a mix of checked and unchecked pointers. In such a program, safety violations are still possible. How, then, does one assess that a program is safer due to checking some, but not all, of its pointers? Providing a formal answer to this question constitutes the core contribution of this paper.

Unlike past safe-C efforts, Checked C explicitly distinguishes the parts of the program that are fully "safe" from those that may not be. So-called *checked regions* differ from unchecked ones in that they can *only* use checked pointers—dereference

<sup>1</sup> https://github.com/Microsoft/checkedc.

or creation of unchecked pointers, unsafe casts, and other potentially dangerous constructs are disallowed. Using a core calculus for Checked C programs called CoreChkC, defined in Sect. 3, we prove in Sect. 4 these restrictions are sufficient to ensure that *checked code cannot be blamed.* That is, checked code is internally safe, and any run-time failure can be attributed to unchecked code, even if that failure occurs in a checked region. This proof has been fully mechanized in the Coq proof assistant.<sup>2</sup> Our theorem fills a gap in the literature on *migratory typing* for languages that, like Checked C, use an *erasure* semantics, meaning that no extra dynamic checks are inserted at checked/unchecked code boundaries [14]. Moreover, our approach is lighter weight than the more sophisticated techniques used by the RustBelt project [17], and constitutes a simpler first step toward a safe, mixed-language design. We say more in Sect. 6.

### **2 Overview of Checked C**

We begin by describing the approach to using Checked C and presenting a brief overview of the language extensions, using the example in Fig. 1. For more about the language see Elliott et al. [11]. The approach works as follows:


The programmer repeats steps 3–5 until as much code as possible (ideally, the entire program) has been made safe.

*Checked Pointers.* As mentioned in the introduction, Checked C supports three varieties of *checked* (safe) pointers: pointers to single objects, Ptr<τ>; pointers to arrays, Array_ptr<τ>; and pointers to NUL-terminated arrays, Nt_array_ptr<τ>. The dat field of **struct** buf, defined in Fig. 1(b), is an Array_ptr<**char**>; its length is specified by the sz field in the same **struct**, as indicated by the count annotation. Nt_array_ptr<τ> types are similar. The q argument of the alloc_buf

<sup>2</sup> https://github.com/plum-umd/checkedc/tree/master/coq.

```
void copy(
  char *dst : byte_count(n),
  const char *src : byte_count(n),
  size_t n);

(a) copy prototype

struct buf {
  Array_ptr<char> dat
    : count(sz - 1);
  unsigned int len;  /* len <= sz */
  unsigned int sz;
};

(b) Type definition

static char region[MAX];  // unchecked
static unsigned int idx = 0;

Checked void alloc_buf(
  Ptr<struct buf> q,
  Array_ptr<const char> src : count(len),
  unsigned int len)
{
  if (len > q->sz) {
    if (idx < MAX && len <= MAX - idx) {
      Unchecked {
        q->dat = &region[idx];
        q->sz = len;
      }
      idx += len;
    } else {
      bug("out of region memory");
    }
  }
  copy(q->dat, src, len);
  q->len = len;
}

(c) Code with checked and unchecked pointers
```
**Fig. 1.** Example Checked C code (slightly simplified for readability)

function in Fig. 1(c) is a Ptr<**struct** buf>. This function overwrites the contents of q with those of the second argument, src, an array whose length is given by the third argument, len. Variables with checked pointer types, or containing checked pointers, must be initialized when they are declared.

*Checked Arrays.* Checked C also supports a checked array type, designated by prefixing the dimension of an array declaration with the keyword **Checked**. For example, **int** arr **Checked**[5] declares a 5-element integer array whose accesses are always bounds checked. A checked array of τ implicitly converts to an Array_ptr<τ> when it is accessed. In our example, the array region has an unchecked array type because the **Checked** keyword is omitted.

*Checked and Unchecked Regions.* Returning to alloc_buf: if q→dat is too small (len > q→sz) to hold the contents of src, the function allocates a block from the static region array, whose free area starts at index idx. Creating a checked Array_ptr<**char**> from a pointer into the middle of the (unchecked) region array is not allowed in checked code, so it must be done within the designated **Unchecked** block. Within such blocks the programmer has the full freedom of C, along with the ability to create and use checked pointers. Checked code, as designated by the **Checked** annotation (e.g., on the alloc_buf function or on a block nested within unchecked code), may not use unchecked pointers or arrays. It also may not define or call variable-argument functions or functions lacking prototypes.

*Interface Types.* Once alloc_buf has allocated q→dat, it calls copy to transfer the data into it from src. Checked C permits normal C functions, such as those in an existing library, to be given an *interface type*: the type that Checked C code should use in a checked region. In an unchecked region, either the original type *or* the interface type may be used, so the function can be called with unchecked or checked types. For copy, this type is shown in Fig. 1(a).

Interface types can also be attached to definitions within a Checked C file, not just to prototypes declared for external libraries. Doing so permits the same function to be called from an unchecked region (with either checked or unchecked types) or from a checked region (where it always has the checked type). For example, if we wanted alloc_buf to be callable from unchecked code with unchecked pointers, we could define its prototype as

```
void alloc_buf(
  struct buf *q : itype(Ptr<struct buf>),
  const char *src : itype(Array_ptr<const char>) count(len),
  unsigned int len);
```
*Implementation Details.* Checked C is implemented as an extension to the Clang/LLVM compiler.<sup>3</sup> The clang front end inserts run-time checks for the evaluation of lvalue expressions whose results are derived from checked pointers and that will be used to access memory. Accessing a Ptr<τ> requires a null check, while accessing an Array_ptr<τ> requires both null and bounds checks. The code for these checks is handed to the LLVM backend, which removes checks it can prove will always pass. In general, such checks are the only source of Checked C run-time overhead. Preliminary experiments on some small, pointer-intensive benchmarks show running-time overhead to be around 8.6% on average [11].

### **3 Formalism: CoreChkC**

This section presents a formal language, CoreChkC, that models the essence of Checked C. The language is designed to be simple while still highlighting Checked C's key features: checked and unchecked pointers, and checked and unchecked code blocks. We prove our key theoretical result—*checked code cannot be blamed* for a spatial safety violation—in the next section.

#### **3.1 Syntax**

The syntax of CoreChkC is presented in Fig. 2. Types τ classify word-sized objects, while types ω also include multi-word objects. The type ptr<sup>m</sup>ω types a pointer, where m identifies its *mode*: mode c identifies a Checked C safe pointer, while mode u represents an unchecked pointer. In other words, ptr<sup>c</sup>τ is a checked pointer type Ptr<τ>, while ptr<sup>u</sup>τ is an unchecked pointer type τ∗.

<sup>3</sup> https://github.com/Microsoft/checkedc-clang.

```
Mode        m  ::= c | u
Word types  τ  ::= int | ptr^m ω
Types       ω  ::= τ | struct T | array n τ
Expressions e  ::= n^τ | x | let x = e1 in e2 | malloc@ω | (τ)e
                 | e1 + e2 | &e→f | ∗e | ∗e1 = e2 | unchecked e
Struct defs D  ∈  T ⇀ fs
Fields      fs ::= τ f | τ f; fs
```
**Fig. 2.** CoreChkC Syntax

Multi-word types ω include struct records and arrays of type τ having size n; i.e., ptr<sup>c</sup>(array n τ) represents a checked array pointer type Array_ptr<τ> with bounds n. We assume structs are defined separately, in a map D from struct names to their constituent field definitions.

Programs are represented as expressions e; for simplicity, we have no separate class of program statements. Expressions include (unsigned) integers n<sup>τ</sup> and local variables x. Constant integers n are annotated with a type τ to indicate their intended type. As in an actual implementation, pointers in our formalism are represented as integers. Annotations help formalize type checking and the safety property it provides; they have no effect on the semantics except when τ is a checked pointer, in which case they facilitate null and bounds checks. Variables x, introduced by let-bindings let x = e<sub>1</sub> in e<sub>2</sub>, can only hold word-sized objects, so structs can only be accessed via pointers.

Checked pointers are constructed using malloc@ω, where ω is the type (and size) of the allocated memory. Thus, malloc@int produces a pointer of type ptr<sup>c</sup>int, while malloc@(array 10 int) produces one of type ptr<sup>c</sup>(array 10 int). Unchecked pointers can only be produced by the cast operator (τ)e, e.g., by (ptr<sup>u</sup>int)malloc@int. Casts can also be used to coerce between integer and pointer types and between different multi-word types.

Pointers are read via the ∗ operator and assigned to via the = operator. To read or write a struct field, a program takes the address of that field and reads or writes that address, e.g., x→f is equivalent to ∗(&x→f). To read or write an array, the programmer uses pointer arithmetic to access the desired element, e.g., x[i] is equivalent to ∗(x + i).

By default, CoreChkC expressions are assumed to be checked. Expression e in unchecked e is unchecked, giving it additional freedom: Checked pointers may be created via casts, and unchecked pointers may be read or written.

*Design Notes.* CoreChkC leaves out many interesting C language features. We do not include an operation for freeing memory, since this paper is concerned with spatial safety, not temporal safety. CoreChkC models statically sized arrays but supports dynamic indexes; supporting dynamic sizes is interesting but not meaningful enough to justify the complexity it would add to the formalism. Making ints unsigned simplifies the handling of pointer arithmetic. We do not model

```
Heap     H ∈  Z ⇀ Z × τ
Result   r ::= e | Null | Bounds
Contexts E ::= □ | let x = E in e | E + e | n + E
             | &E→f | (τ)E | ∗E | ∗E = e | ∗n = E | unchecked E
```
**Fig. 3.** Semantics definitions

control operators or function calls, whose addition would be straightforward.<sup>4</sup> CoreChkC does not have a checked e expression for nesting within unchecked expressions, but supporting it would be easy.

### **3.2 Semantics**

Figure 4 defines the small-step operational semantics for CoreChkC expressions in the form of a judgment H; e −→<sub>m</sub> H′; r. Here, H is a *heap*: a partial map from integers (representing pointer addresses) to type-annotated integers n<sup>τ</sup>. Annotation m is the *mode* of evaluation, which is either c for checked mode or u for unchecked mode. Finally, r is a *result*: either an expression e, Null (indicating a null-pointer dereference), or Bounds (indicating an out-of-bounds array access). An unsafe program execution occurs when the expression reaches a *stuck* state—the program is not an integer n<sup>τ</sup>, and yet no rule applies. Notably, this can happen when dereferencing a pointer n that is actually invalid, i.e., when H(n) is undefined.

The semantics is defined in the standard manner using *evaluation contexts* E. We write E[e<sub>0</sub>] for the expression that results from substituting e<sub>0</sub> into the hole (□) of context E. Rule C-Exp defines normal evaluation: it decomposes an expression e into a context E and an expression e<sub>0</sub>, and then evaluates the latter via H; e<sub>0</sub> ⇝ H′; e′<sub>0</sub>, discussed below. The evaluation mode m is constrained by the *mode*(E) function, also given in Fig. 4. The rule and this function ensure that when evaluation occurs within e in some expression unchecked e, it does so in unchecked mode u; otherwise it may be in checked mode c. Rule C-Halt halts evaluation due to a failed null or bounds check.

The rules prefixed with E- belong to the computation semantics H; e<sub>0</sub> ⇝ H′; e′<sub>0</sub>. The semantics is implicitly parameterized by the struct map D. The rest of this section provides additional details for each rule, followed by a discussion of CoreChkC's type system.

Rule E-Binop produces an integer n<sub>3</sub> that is the sum of arguments n<sub>1</sub> and n<sub>2</sub>. As mentioned earlier, the annotation τ on a literal n<sup>τ</sup> indicates the type the program has ascribed to n. When a type annotation is not a checked pointer, the semantics ignores it. In the particular case of E-Binop, for example, addition

<sup>4</sup> Function calls f(e) can be modeled by let x = e<sub>1</sub> in e<sub>2</sub>, where we can view x as function f's parameter, e<sub>2</sub> as its body, and e<sub>1</sub> as its actual argument. Calls to unchecked functions from checked code can thus be simulated by having an unchecked e expression for e<sub>2</sub>.


**Fig. 4.** Operational semantics

n<sub>1</sub><sup>τ1</sup> + n<sub>2</sub><sup>τ2</sup> ignores τ<sub>1</sub> and τ<sub>2</sub> when τ<sub>1</sub> is not a checked pointer, and simply annotates the result with τ<sub>1</sub>. However, when τ is a checked pointer, the rules use it to model bounds checks; in particular, dereferencing n<sup>τ</sup> where τ is ptr<sup>c</sup>(array l τ<sub>0</sub>) produces Bounds when l = 0 (more below). As such, when n<sub>1</sub> is a non-zero checked pointer to an array and n<sub>2</sub> is an int, the result n<sub>3</sub> is annotated as a pointer to an array with its bounds suitably updated.<sup>5</sup> Checked pointer arithmetic on 0 is disallowed; see below.

Rules E-Deref and E-Assign confirm the bounds of checked array pointers: the length l must be positive for the dereference to be legal. The rules permit the program to proceed for non-checked or non-array pointers (but the type system will forbid them).

Rule E-Amper takes the address of a struct field, according to the type annotation on the pointer, as long as the pointer is not a null checked pointer.

Rule E-Malloc allocates a checked pointer by finding a run of k free heap locations and initializing each to 0, annotated with the appropriate type. Here, types(D, ω) returns k types, the types of the corresponding memory words; e.g., if ω is a struct, these are the types of its fields (looked up in D), while if ω is an array of length k containing values of type τ, we get back k copies of τ. We require k ≠ 0, or the program is stuck (a situation precluded by the type system).

Rule E-Let uses a substitution semantics for local variables; the notation e[x ↦ n<sup>τ</sup>] means that all occurrences of x in e are replaced with n<sup>τ</sup>.

Rule E-Unchecked returns the result of an unchecked block.

Rules with prefix X- describe failures due to bounds checks and null checks on checked pointers. These are analogues to the E-Assign, E-Deref, E-Binop, and E-Amper cases. The first two rules indicate a bounds violation for size-zero array pointers. The next two indicate an attempt to dereference a null pointer. The last two indicate an attempt to construct a checked pointer from a null pointer via field access or pointer arithmetic.

#### **3.3 Typing**

The typing judgment Γ; σ ⊢<sub>m</sub> e : τ says that expression e has type τ under environment Γ and scope σ when in mode m. A scope σ is an additional environment consisting of a set of literals; it is used to type cyclic structures (in Rule T-PtrC, below) that may arise during program evaluation. The heap H and struct map D are implicit parameters of the judgment; they do not appear because they are invariant in derivations. unchecked expressions are typed in mode u; otherwise we may use either mode.

Γ maps variables x to types τ and is used in Rules T-Var and T-Let as usual. Rule T-Base ascribes type τ to a literal n<sup>τ</sup>. This is always safe when τ is int. If τ is an unchecked pointer type, the type system only permits a dereference in unchecked code (see below), so any sort of failure (including a stuck program) is not a safety violation. When n is 0, τ can be anything, including a checked pointer type, because dereferencing n would (safely) produce Null. Finally, if τ is ptr<sup>c</sup>(array 0 τ<sub>0</sub>), then dereferencing n would (safely) produce Bounds.

<sup>5</sup> Here, l − n<sub>2</sub> is natural-number arithmetic: if n<sub>2</sub> > l then l − n<sub>2</sub> = 0. This would have to be adjusted if the language contained subtraction, or else bounds information would be unsound.


**Fig. 5.** Typing

Rule T-PtrC is perhaps the most interesting rule of CoreChkC. It ensures checked pointers of type ptr<sup>c</sup>ω are consistent with the heap by confirming that the pointed-to heap memory has types consistent with ω, recursively. When doing this, we extend σ with n<sup>τ</sup> to properly handle cyclic heap structures; σ is used by Rule T-VConst.

To make things more concrete, consider the following program that constructs a cyclic cons cell, using a standard single-linked list representation:

```
D(node) = int val; ptr^c struct node next
let p = malloc@struct node in ∗(&p→next) = p
```
After executing the program above, with n the integer value of p, the n-th location of the heap contains 0 (the default value for field *val* picked by malloc), while the (n + 1)-th location, which corresponds to field *next*, contains the literal n itself.

How can we type the pointer n<sup>ptr<sup>c</sup> struct node</sup> in this heap without an infinite typing derivation?

$$\Gamma; \sigma \vdash_c n^{\texttt{ptr}^c\,\texttt{struct}\ node} : \texttt{ptr}^c\,\texttt{struct}\ node$$

That's where the scope comes in, to break the recursion. In particular, using Rule T-PtrC and struct node's definition, we would need to prove two things:

$$\begin{aligned} \Gamma; \sigma, n^{\texttt{ptr}^c\,\texttt{struct}\ node} &\vdash_c H(n+0) : \texttt{int} \quad\text{and} \\ \Gamma; \sigma, n^{\texttt{ptr}^c\,\texttt{struct}\ node} &\vdash_c H(n+1) : \texttt{ptr}^c\,\texttt{struct}\ node \end{aligned}$$

Since H(n + 0) = 0, as malloc zeroes out its memory, we can trivially prove the first goal using Rule T-Base. However, the second goal is almost exactly what we set out to prove in the first place! If not for the presence of the scope σ, the proof that n is typeable would be infinite. However, by adding n<sup>ptr<sup>c</sup> struct node</sup> to the scope, we are essentially assuming it is well typed while typing its contents, and the desired result follows by Rule T-VConst.<sup>6</sup>

A key feature of T-PtrC is that it effectively confirms that all pointers reachable from the given one are consistent; it says nothing about other parts of the heap. So, if a set of checked pointers is only reachable via unchecked pointers then we are not concerned whether they are consistent, since they cannot be directly dereferenced by checked code.

Back to the remaining rules, T-Amper and T-BinopInt are unsurprising. Rule T-Malloc produces checked pointers so long as the pointed-to type ω is not zero-sized, i.e., is not array 0 τ . Rule T-Unchecked introduces unchecked mode, relaxing access rules. Rule T-Cast enforces that checked pointers cannot be cast targets in checked mode.

Rules T-Deref and T-Assign type pointer accesses. These rules require that unchecked pointers be dereferenced only in unchecked mode. Rule T-Index permits

<sup>6</sup> For readers familiar with coinduction [28], this proof technique is similar: to prove a coinductive property *P* one would assume *P* but need to use it *productively* in a subterm; similarly here, we can assume a pointer is well-typed when we attempt to type heap locations that are reachable from it.

reading a computed pointer to an array, and rule T-IndAssign permits writing to one. These rules are not strong enough to permit updating a pointer to an array after performing arithmetic on it. In general, Checked C's design permits overcoming such limitations through selective use of casts in unchecked code. (That said, our implementation is more flexible in this particular case.)

### **4 Checked Code Cannot Be Blamed**

Our main formal result is that well-typed programs will never fail with a spatial safety violation that is due to a checked region of code, i.e., *checked code cannot be blamed.* This section presents the main result and outlines its proof. We have mechanized the full proof using the Coq proof assistant. The development is roughly 3500 lines long, including comments. It is freely available at https://github.com/plum-umd/checkedc/tree/master/coq.

#### **4.1 Progress and Preservation**

The blame theorem is proved using the two standard syntactic type-safety notions of Progress and Preservation, adapted for CoreChkC. Progress indicates that a (closed) well-typed program either is a value, can take a step (in either mode), or else is stuck in unchecked code. A program is in unchecked mode if its expression e only type checks in mode u, or its (unique) context E has mode u.

**Theorem 1 (Progress).** *If* · ⊢<sub>m</sub> e : τ *(under heap* H*) then one of the following holds:*

- e *is an integer* n<sup>τ</sup> *(a value);*
- H; e −→<sub>m′</sub> H′; r *for some* H′, r*, and mode* m′*; or*
- e *is stuck in unchecked code, i.e., either* m = u *or* e = E[e′] *with* mode(E) = u*.*
Preservation indicates that if a well-typed program in checked mode takes a checked step then the resulting program is also well-typed in checked mode.

**Theorem 2 (Preservation).** *If* Γ; · ⊢<sub>c</sub> e : τ *(under a heap* H*) and* H; e −→<sub>c</sub> H′; r *(for some* H′, r*), then* r = e′ *implies* H ⊲ H′ *and* Γ; · ⊢<sub>c</sub> e′ : τ *(under heap* H′*).*

We write H ⊲ H′ to mean that for all n<sup>τ</sup>, if · ⊢<sub>c</sub> n<sup>τ</sup> : τ under H, then · ⊢<sub>c</sub> n<sup>τ</sup> : τ under H′ as well.

The proofs of both theorems are by induction on the typing derivation. The Preservation proof is the most delicate, particularly in ensuring H ⊲ H′ despite the creation or modification of cyclic data structures. Crucial to the proof were two lemmas dealing with the scope: *weakening* and *strengthening*.

The first lemma, scope weakening, allows us to arbitrarily extend a scope with any literal n<sub>0</sub><sup>τ0</sup>.

**Lemma 1 (Weakening).** *If* Γ; σ ⊢<sub>m</sub> n<sup>τ</sup> : τ*, then* Γ; σ, n<sub>0</sub><sup>τ0</sup> ⊢<sub>m</sub> n<sup>τ</sup> : τ *for all* n<sub>0</sub><sup>τ0</sup>*.*

Intuitively, this lemma holds because if a proof of Γ; σ ⊢<sub>m</sub> n<sup>τ</sup> : τ relies on Rule T-VConst, then n<sub>1</sub><sup>τ1</sup> ∈ σ for some n<sub>1</sub><sup>τ1</sup>. But then n<sub>1</sub><sup>τ1</sup> ∈ (σ, n<sub>0</sub><sup>τ0</sup>) as well. Importantly, the scope σ is a *set* of literals n<sup>τ</sup> and not a map from n to τ. As such, if n<sup>τ′</sup> is already present in σ, adding n<sup>τ0</sup> will not clobber it. Allowing the same literal to have multiple types is of practical importance. For example, a pointer n to a struct could be annotated with the type of the struct, the type of the first field of the struct, or int; all may safely appear in the scope.

Consider the proof that n<sup>ptr<sup>c</sup> struct node</sup> is well typed for the heap given in Sect. 3.3. After applying Rule T-PtrC, we used the fact that n<sup>ptr<sup>c</sup> struct node</sup> ∈ σ, n<sup>ptr<sup>c</sup> struct node</sup> to prove that the *next* field of the struct is well typed. If we were to replace σ with another scope σ, n<sub>0</sub><sup>τ0</sup> for some typed literal n<sub>0</sub><sup>τ0</sup> (and, as a result, with any scope that is a superset of σ), the inclusion n<sup>ptr<sup>c</sup> struct node</sup> ∈ σ, n<sub>0</sub><sup>τ0</sup>, n<sup>ptr<sup>c</sup> struct node</sup> still holds and our pointer is still well typed.

Conversely, the second lemma, scope strengthening, allows us to remove a literal from a scope if that literal is well typed in an empty scope.

**Lemma 2 (Strengthening).** *If* Γ; σ ⊢<sub>m</sub> n<sub>1</sub><sup>τ1</sup> : τ<sub>1</sub> *and* Γ; · ⊢<sub>m</sub> n<sub>2</sub><sup>τ2</sup> : τ<sub>2</sub>*, then* Γ; σ\{n<sub>2</sub><sup>τ2</sup>} ⊢<sub>m</sub> n<sub>1</sub><sup>τ1</sup> : τ<sub>1</sub>*.*

Informally, if the fact that $n_2^{\tau_2}$ is in the scope is used in the proof of well-typedness of $n_1^{\tau_1}$ to prove that $n_2^{\tau_2}$ is well typed for some scope $\sigma$, then we can instead use the proof that it is well typed in an empty scope, along with weakening, to reach the same conclusion.

Looking back again at the proof of the previous section, we know that

$$\begin{array}{c} \Gamma; \cdot \vdash_c n : \texttt{ptr}^c\,\texttt{struct}\ node\\ \text{and} \\ \Gamma; \sigma, n^{\texttt{ptr}^c\,\texttt{struct}\ node} \vdash_c \&n{\rightarrow}\texttt{next} : \texttt{ptr}^c\,\texttt{struct}\ node \end{array}$$

While the proof of the latter fact relies on $n^{\texttt{ptr}^c\,\texttt{struct}\ node}$ being in scope, that would not be necessary if we knew (independently) that it was well typed. That would essentially amount to unrolling the proof by one step.

#### **4.2 Blame**

With progress and preservation we can prove a *blame theorem*: Only unchecked code can be blamed as the ultimate reason for a stuck program.

**Theorem 3 (Checked code cannot be blamed).** *Suppose* $\cdot \vdash_c e : \tau$ *(under heap* $H$*) and there exist* $H_i$*,* $m_i$*, and* $e_i$ *for* $1 \le i \le k$ *such that* $H; e \longrightarrow^{m_1} H_1; e_1 \longrightarrow^{m_2} \dots \longrightarrow^{m_k} H_k; e_k$*. If* $H_k; e_k$ *is stuck, then the source of the issue is unchecked code.*

*Proof.* Suppose $\cdot \vdash_c e_k : \tau$ (under heap $H_k$). By Progress, the only way that $H_k; e_k$ can be stuck is if $e_k = E[e]$ and $\textit{mode}(E) = \texttt{u}$; i.e., the term's redex is in unchecked code. Otherwise, $H_k; e_k$ is not well typed, i.e., $\cdot \not\vdash_c e_k : \tau$ (under heap $H_k$). In that case, one of the steps of the evaluation must have been in unchecked code, i.e., there must exist some $i$ with $1 \le i \le k$ and $m_i = \texttt{u}$. This is because, by Preservation, a well-typed program in checked mode that takes a checked step always leads to a well-typed program in checked mode.

This theorem means that a code reviewer can focus on unchecked code regions, trusting that checked ones are safe.

### **5 Porting Assistance**

Porting legacy code to use Checked C's features can be tedious and time consuming. To assist the process, we developed a source-to-source translator called *checked-c-convert* that discovers some safely-used pointers and rewrites them to be checked. This algorithm is based on one used by CCured [26], but exploits Checked C's allowance of mixing checked and unchecked pointers to make less conservative decisions.

The *checked-c-convert* translator works by (1) traversing a program's abstract syntax tree (AST) to generate constraints based on pointer variable declarations and uses; (2) solving those constraints; and (3) rewriting the program. The rewrites promote some declared pointer types to *checked* ones, convert some parameter types to *bounds-safe interfaces*, and insert some casts. *checked-c-convert* aims to produce a well-formed Checked C program whose changes from the original are minimal and unsurprising. A particular challenge is to preserve the syntactic structure of the program: a rewritten program should be recognizable by its author, and it should be usable as a starting point both for the development of new features and for additional porting. The *checked-c-convert* tool is implemented as a clang libtooling application and is freely available.

#### **5.1 Constraint Logic and Solving**

The basic approach is to infer a *qualifier* q<sub>i</sub> for each defined pointer variable i. Inspired by CCured's approach [26], qualifiers are one of *PTR*, *ARR*, or *UNK*, ordered as a lattice *PTR* < *ARR* < *UNK*. Variables with inferred qualifier *PTR* can be rewritten into Ptr*<*τ*>* types, while those with *UNK* are left as is. Those with the *ARR* qualifier are eligible for the Array ptr*<*τ*>* type; for the moment we only signal this fact in a comment, and do not rewrite, because we cannot always infer proper bounds expressions.

Qualifiers are introduced at each pointer variable declaration, i.e., parameter, variable, field, etc. Constraints are introduced as a pointer is used, and take one of the following forms:

$$\begin{array}{ll} q_i = \mathit{PTR} & q_i \neq \mathit{PTR}\\ q_i = \mathit{ARR} & q_i \neq \mathit{ARR}\\ q_i = \mathit{UNK} & q_i \neq \mathit{UNK}\\ q_i = q_j & q_i = \mathit{ARR} \Rightarrow q_j = \mathit{ARR}\\ & q_i = \mathit{UNK} \Rightarrow q_j = \mathit{UNK} \end{array}$$

An expression that performs arithmetic on a pointer with qualifier q<sub>i</sub>, either via + or [], introduces a constraint q<sub>i</sub> = *ARR*. Assignments between pointers introduce aliasing constraints of the form q<sub>i</sub> = q<sub>j</sub>. Casts introduce implication constraints based on the relationship between the sizes of the two types. If the sizes are not comparable, then both constraint variables in an assignment-based cast are constrained to *UNK* via an equality constraint. One difference from CCured is our use of negation constraints, which fix a constraint variable to a particular Checked C type (e.g., due to an existing Ptr*<*τ*>* annotation). Such constraints would cause problems for CCured, as they might introduce unresolvable conflicts; but Checked C's allowance of checked and unchecked code can resolve them using explicit casts and bounds-safe interfaces, as discussed below.

One problem with unification-based analysis is that a single unsafe use might "pollute" the constraint system by introducing an equality constraint to *UNK* that transitively constrains unified qualifiers to *UNK* as well. For example, casting a **struct** pointer to an **unsigned char** buffer to write to the network would cause all transitive uses of that pointer to be unchecked. The tool takes advantage of Checked C's ability to mix checked and unchecked pointers to solve this problem. In particular, constraints for each function are solved locally, using separate qualifier variables for each external function's declared parameters.

#### **5.2 Algorithm**

Our modular algorithm runs as follows:

	- *Declarations*: A function may have multiple declarations. The constraint variables for the parameters and return values in all declarations are constrained to be equal to each other. At call sites, the constraint variables used for a function's parameters and return values come from those in the declaration, not the definition (unless there is no declaration).

#### **5.3 Resolving Conflicts**

Defining distinct constraint variables for function declarations, used at call sites, and for function definitions, used within the function body, can result in conflicting solutions. If there is a conflict, then either the declaration's solution is safer than the definition's, or the definition's is safer than the declaration's. Which case we are in can be determined by considering the relationship between the variables' valuations in the qualifier lattice. There are three cases:


*Example: caller is safer than callee:* Consider a function that makes unsafe use of its parameter within the body of the function, but a caller of the function passes an argument that is only ever used safely.

```
void f(int *a) {
  *(int **)a = a;
}

void caller(void) {
  int q = 0;
  int *p = &q;
  f(p);
}
```
Here, we cannot make `a` safe, since its use is outside Checked C's type system. Under a unification-only approach, this fact would poison all arguments passed to f as well, i.e., p in caller. This is unfortunate, since p is used safely inside caller. Our algorithm remedies this situation by performing the conversion and inserting a cast:

```
void caller(void) {
  int q = 0;
  Ptr<int> p = &q;
  f((int *)p);
}
```
The presence of the cast indicates to the programmer that perhaps there is something in f that should be investigated.

*Example: caller less safe than callee:* Now consider a function that makes safe use of the parameter within the body of the function, but a caller of the function might perform casts or other unsafe operations on an argument it passes.

```
void f(int *a) {
  *a = 0;
}

void caller(void) {
  int q = 0;
  f(&q);
  f((int *)0x8f8000);
}
```
```
If considered in isolation, the function f is safe and the parameter could be rewritten to Ptr*<***int***>*. However, it is used from an unsafe context. In an approach with pure unification, like CCured's, this unsafe use at the call site would pollute the classification at the definition. Our algorithm considers solutions at call sites and definitions independently. Here, the uses of f in caller are less safe than those in f's definition, so the rewriter inserts a bounds-safe interface for f:

```
void f(int *a : itype(Ptr<int>)) {
  *a = 0;
}
```
```
The itype syntax indicates that a can be supplied by the caller as either an **int**<sup>∗</sup> or a Ptr*<*τ*>*, but the function body will treat <sup>a</sup> as a Ptr*<*τ*>*. (See Sect. <sup>2</sup> for more on interface types.)

This approach has advantages and disadvantages. It favors making the fewest modifications across a project. An alternative to using interface types would be to change the parameter type to a Ptr*<*τ*>* directly and then insert casts at each call site. This would tell the programmer where potentially bogus pointer values arise, but would also increase the number of changes made. Our approach does not immediately tell the programmer where pointer changes need to be made. However, the Checked C compiler will do so if the programmer takes a bounds-safe interface and manually converts it into a non-interface Ptr*<*τ*>* type: every location that would require a cast will fail to type check, signaling the programmer to have a closer look.


**Table 1.** Number of pointer declarations converted through automated porting

#### **5.4 Experimental Evaluation**

We carried out a preliminary experimental evaluation of the efficacy of *checked-c-convert*. To do so, we ran it on five targets—programs and libraries—and recorded how many pointer types the rewriter converted and how many casts were inserted. We chose these targets because they constitute legacy code used in commodity systems, and in security-sensitive contexts.

Running *checked-c-convert* on each target took no more than 30 minutes. Table 1 contains the results. The first and last columns indicate the target, its version, and the lines of code it contains (per cloc). The second column (# **of** \*) counts the number of pointer definitions or declarations in the program, i.e., places that might get rewritten when porting. The next three columns (% **Ptr**, **Arr.**, **Unk.**) indicate the percentages of these that were determined to be PTR, ARR, or UNK, respectively, where only those in % **Ptr** induce a rewriting action. The results show that a fair number of variables can be automatically rewritten as safe, single pointers (Ptr*<*τ*>*). Upon investigation, there are usually two reasons that a pointer cannot be replaced with a Ptr*<*τ*>*: either some arithmetic is performed on the pointer, or it is passed as a parameter to a library function for which a bounds-safe interface does not exist.

The next two columns (**Casts(Calls)**, **Ifcs(Funcs)**) examine how our rewriting algorithm takes advantage of Checked C's support for incremental conversion. In particular, column 6 (**Casts(Calls)**) counts how many times we cast a safe pointer at the call site of a function deemed to use that pointer unsafely; in parentheses we indicate the total number of call sites in the program. Column 7 (**Ifcs(Funcs)**) counts how often a function definition or declaration has its type rewritten to use an interface type, where the total declaration/definition count is in parentheses. This rewriting occurs when the function itself uses at least one of its parameters safely, but at least one caller provides an argument that is deemed unsafe. Both columns together represent an improvement in precision, compared to unification-only, due to Checked C's focus on backward compatibility.

This experiment represents the first step a developer would take toward adopting Checked C in their project. The values converted into Ptr*<*τ*>* by the rewriter need never be considered again during the rest of the conversion, or by subsequent software assurance/bug-finding efforts.

### **6 Related Work**

There has been substantial prior work that aims to address the vulnerability presented by C's lack of memory safety. A detailed discussion of how this work compares to Checked C can be found in Elliott et al. [11]. Here we discuss approaches for automating C safety, as that is most related to work on our rewriting algorithm. We also discuss prior work generally on *migratory typing*, which aims to support backward compatible migration of an untyped/less-typed program to a statically typed one.

*Security Mitigations.* The lack of memory safety in C and C++ has serious practical consequences, especially for security, so there has been extensive research toward addressing it automatically. One approach is to attempt to detect memory corruption after it has happened or prevent an attacker from exploiting a memory vulnerability. Approaches deployed in practice include stack canaries [31], address space layout randomization (ASLR) [34], data-execution prevention (DEP), and control-flow integrity (CFI) [1]. These defenses have led to an escalating series of measures and counter-measures by attackers and defenders [32]. These approaches do not prevent data modification or data disclosure attacks, and they can be defeated by determined attackers who use those attacks. By contrast, enforcing memory safety avoids these issues.

*Memory-Safe C.* Another important line of prior work aims to enforce memory safety for C; here we focus on projects that aim to do so (mostly) automatically, in a way related to our rewriting algorithm. CCured [26] is a source-to-source rewriter that transforms C programs to be safe automatically. CCured's goal is end-to-end soundness for the entire program. It uses a whole-program analysis that divides pointers into fat pointers (which allow pointer arithmetic and unsafe casts) and thin pointers (which do not). The use of fat pointers causes problems interoperating with existing libraries and systems, making the CCured approach impractical when such interoperation is necessary. Other systems attempt to overcome the limitations of fat pointers by storing the bounds information in a separate metadata space [24,25] or within unused bits in 64-bit pointers [19] (though this approach is unsound [13]). These approaches can add substantial overhead; e.g., SoftBound's overhead for spatial safety checking is 67%. Deputy [38] uses backward-compatible pointer representations with types similar to those in Checked C. It supports inference local to a function, but resorts to manual annotations at function and module boundaries. None of these systems permit intermixing safe and unsafe pointers within a module, as Checked C does, which means that some code simply needs to be rewritten rather than included but clearly marked within **Unchecked** blocks.

*Migratory Typing.* Checked C is closely related to work supporting migratory typing [35] (aka gradual typing [30]). In that setting, portions of a program written in a dynamically typed language can be annotated with static types. For Checked C, legacy C plays the role of the dynamically typed language and checked regions play the role of statically typed portions. In migratory typing, one typically proves that a fully annotated program is statically type-safe. What about mixed programs? They can be given a semantics that checks static types at boundary crossings [21]. For example, calling a statically typed function from dynamically typed code would induce a dynamic check that the passed-in argument has the specified type. When a function is passed as an argument, this check must be deferred until the function is called. The delay prompted research on proving *blame*: Even if a failure were to occur within static code, it could be blamed on bogus values provided by dynamic code [36]. This semantics is, however, slow [33], so many languages opt for what Greenman and Felleisen [14] term the *erasure semantics*: No checks are added and no notion of blame is proved, i.e., failures in statically typed code are not formally connected to errors in dynamic code. Checked C also has erasure semantics, but Theorem 3 is able to lay blame with the unchecked code.

*Rust.* Rust [20] is a programming language, like C, that supports zero-cost abstractions, but like Checked C, aims to be safe. Rust programs may have designated unsafe blocks in which certain rules are relaxed, potentially allowing run-time failures. As with Checked C, the question is how to reason about the safety of a program that contains any amount of unsafe code. The RustBelt project [17] proposes to use a semantic [23], rather than syntactic [37], account of soundness, in which (1) types are given meaning according to what terms inhabit them; (2) type rules are sound when interpreted semantically; and (3) semantic well typing implies safe execution. With this approach, unsafe code can be (manually) proved to inhabit the semantic interpretation of its type, in which case its use by type-checked code will be safe.

We view our approach as complementary to that of RustBelt, perhaps constituting the first step in mixed-language safety assurance. In particular, we employ a simple, syntactic proof that checked code is safe and unchecked code can always be blamed for a failure—no proof about any particular unsafe code is required. Stronger assurance that programs are safe despite using mixed code could employ the (more involved and labor-intensive) RustBelt approach.

### **7 Conclusions and Future Work**

This paper has presented CoreChkC, a core formalism for Checked C, an extension to C aiming to provide spatial safety. CoreChkC models Checked C's safe (checked) and unsafe (legacy) pointers; while these pointers can be intermixed, the use of legacy pointers is severely restricted in *checked regions* of code. We prove that these restrictions are efficacious: *checked code cannot be blamed*, in the sense that any spatial safety violation must be directly or indirectly due to an unsafe operation outside a checked region. Our formalization and proof are mechanized in the Coq proof assistant; this mechanization is available at https://github.com/plum-umd/checkedc/tree/master/coq.

The freedom to intermix safe and legacy pointers in Checked C programs affords flexibility when porting legacy code. We show this is true for *automated porting* as well. A whole-program rewriting algorithm we built is able to make more pointers safe than it would if pointer types were all-or-nothing; we do this by taking advantage of Checked C's allowed casts and interface types. The tool implementing this algorithm, *checked-c-convert*, is distributed with Checked C at https://github.com/Microsoft/checkedc-clang.

As future work, we are interested in formalizing other aspects of Checked C, notably its *subsumption algorithm* and support for *flow-sensitive* typing (to handle pointer arithmetic), to prove that these aspects of the implementation are correct. We are also interested in expanding support for the rewriting algorithm, by using more advanced static analysis techniques to infer numeric bounds suitable for rewriting array types. Finally, we hope to automatically infer regions of code that could be enclosed within checked regions.

**Acknowledgments.** We would like to thank the anonymous reviewers for helpful comments on drafts of this paper, and Sam Elliott for contributions to the portions of the design and implementation of Checked C presented in this paper. This research was funded in part by the U.S. National Science Foundation under grants CNS-1801545 and EDU-1319147.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

#### Wys⋆: A DSL for Verified Secure Multi-party Computations

Aseem Rastogi1(B) , Nikhil Swamy<sup>2</sup>, and Michael Hicks<sup>3</sup>

> <sup>1</sup> Microsoft Research, Bangalore, India aseemr@microsoft.com <sup>2</sup> Microsoft Research, Redmond, USA nswamy@microsoft.com <sup>3</sup> University of Maryland, College Park, USA mwh@cs.umd.edu

Abstract. Secure multi-party computation (MPC) enables a set of mutually distrusting parties to cooperatively compute, using a cryptographic protocol, a function over their private data. This paper presents Wys⋆, a new domain-specific language (DSL) for writing *mixed-mode* MPCs. Wys⋆ is an embedded DSL hosted in F⋆, a verification-oriented, effectful programming language. Wys⋆ source programs are essentially F⋆ programs written in a custom MPC effect, meaning that programmers can use F⋆'s logic to verify the correctness and security properties of their programs. To reason about the distributed runtime semantics of these programs, we formalize a deep embedding of Wys⋆, also in F⋆. We mechanize the necessary metatheory to prove that the properties verified for the Wys⋆ source programs carry over to the distributed, multi-party semantics. Finally, we use F⋆'s extraction to extract an interpreter that we have proved matches this semantics, yielding a partially verified implementation. Wys⋆ is the first DSL to enable formal verification of MPC programs. We have implemented several MPC protocols in Wys⋆, including private set intersection, joint median, and an MPC-based card dealing application, and have verified their correctness and security.

### 1 Introduction

Secure multi-party computation (MPC) enables two or more parties to compute a function f over their private inputs x<sub>i</sub> so that the parties do not see each other's inputs, but rather only see the output f(x<sub>1</sub>, ..., x<sub>n</sub>). Using a trusted third party to compute f would achieve this goal, but in fact we can achieve it using one of a variety of cryptographic protocols carried out only among the participants [12,26,58,65]. One example use of MPC is private set intersection (PSI): the x<sub>i</sub> could be individuals' personal interests, and the function f computes their intersection, revealing which interests the group has in common, but not any interests that they don't. MPC has also been used for auctions [18], detecting tax fraud [16], managing supply chains [33], privacy-preserving statistical analysis [31], and more recently for machine learning tasks [19,21,30,38,44].

Typically, cryptographic protocols expect f to be specified as a boolean or arithmetic circuit. Programming directly with circuits and cryptography is painful, so starting with the Fairplay project [40] many researchers have designed higher-level domain-specific languages (DSLs) for programming MPCs [6,14,17, 19,23,27,29,34,37,39,45,48,49,52,56,61]. These DSLs compile source code to circuits which are then given to the underlying cryptographic protocol. While doing this undoubtedly makes it easier to program MPCs, these languages still have several drawbacks regarding both security and usability.

This paper presents Wys⋆, a new MPC DSL that addresses several problems in prior DSLs. Unlike most previous MPC DSLs, Wys⋆ is not a standalone language, but is rather an embedded DSL hosted in F⋆ [59], a full-featured, verification-oriented, effectful programming language. Wys⋆ has the following two distinguishing elements:

*1. A program logic for MPC* (Sects. 2 and 3). In their most general form, MPC applications are *mixed-mode*: they consist of parties performing (potentially different) local, in-clear computations (e.g., I/O, preprocessing of inputs) interleaved with joint, secure computations. Wys⋆ is the first MPC DSL to provide a program logic to formally reason about the *correctness and security* of such applications, e.g., to prove that the outputs will not reveal too much information about a party's inputs [41].<sup>1</sup>

To avoid reasoning about separate programs for each party, Wys⋆ builds on the basic programming model of the Wysteria MPC DSL [52], which allows applications to be written as a single specification. Wys⋆ presents a *shallow embedding* of the Wysteria programming model in F⋆. When writing Wys⋆ source programs, programmers essentially write F⋆ programs in a new Wys⋆ effect, against a library of MPC combinators. The pre- and postcondition specifications on the combinators encode a program logic for MPC. The logic provides *observable traces*—a novel addition to the Wysteria semantics—which programmers can use to specify security properties such as delimited release [55]. Since Wys⋆ programs are F⋆ programs, F⋆ computes verification conditions (VCs) for them, which are discharged using Z3 [2] as usual.

We prove the soundness of the program logic—that the properties proven about Wys⋆ source programs carry over when these programs are run by multiple parties in a distributed manner—also in F⋆. The proof connects the pre- and postconditions of the Wys⋆ combinators to their distributed semantics in two steps. First, we implement the combinators in F⋆, proving the validity of their pre- and postconditions against their implementation. Next, we reason about this implementation and the distributed runtime semantics through a deep embedding of Wys⋆ in F⋆. Essentially, we deep-embed the Wys⋆ combinator abstract syntax trees (ASTs) as an F⋆ datatype and formalize two operational semantics for them: a conceptual single-threaded semantics that models their

<sup>1</sup> Our attacker model is the "honest-but-curious" model, where the attackers are the participants themselves, who play their roles in the protocol faithfully but are motivated to infer as much as they can about the other participants' secrets by observing the protocol. Section 2.3 makes the security model of Wys⋆ more precise.

F⋆ implementation, and the actual distributed semantics that models the multi-party runs of the programs. We prove, in F⋆, that the single-threaded semantics is sound with respect to the distributed semantics (Sect. 3). While we use F⋆, the program logic is general, and it should be possible to embed it in other verification frameworks (e.g., in Coq, in the style of Hoare Type Theory [46]).

*2. A full-featured, partially verified implementation* (Sect. 3). Wys⋆'s implementation is, in part, formally verified. The hope is that formal verification will reduce the occurrence of security-threatening bugs, as it has in prior work [15,36,50,63,64].

We define an interpreter in F⋆ that operates over the Wys⋆ ASTs produced by a custom F⋆ extraction for the Wys⋆ effect. While local computations are executed directly by the interpreter, secure-computation ASTs are compiled to circuits, on the fly, and executed using the Goldreich, Micali, and Wigderson (GMW) multi-party computation protocol [26]. The Wys⋆ AST (and hence the interpreter) does not "bake in" standard F⋆ constructs like numbers and lists. Rather, inherited language features appear abstractly in the AST, and their semantics is handled by a foreign function interface (FFI). This permits Wys⋆ programs to take advantage of existing code and libraries available in F⋆.

To prove that the interpreter behaves correctly, we prove, in F⋆, that it correctly implements the formalized distributed semantics. The circuit library and the GMW implementation are not verified—while it is possible to verify the circuit library [4], verifying a GMW implementation is an open research question. But the stage is set for verified versions to be plugged into the Wys⋆ codebase. We characterize the Trusted Computing Base (TCB) of the Wys⋆ toolchain in Sect. 3.5.

Using Wys⋆ we have implemented several programs, including PSI, joint median, and a card dealing application (Sect. 4). For PSI and joint median we implement two versions: a straightforward one, and an optimized one that improves performance but increases the number of adversary-observable events. We formally prove that the optimized and unoptimized versions are equivalent, both functionally and w.r.t. the privacy of the parties' inputs. Our card dealing application relies on Wys⋆'s support for secret shares [57]. We formally prove that the card dealing algorithm always deals a fresh card.

In sum, Wys⋆ constitutes the first DSL that supports proving security and correctness properties about MPC programs, which are executed by a partially verified implementation of a full-featured language. No prior DSL provides these benefits (Sect. 5). The Wys⋆ implementation, example programs, and proofs are publicly available on GitHub at https://github.com/FStarLang/FStar/tree/stratified_last/examples/wysteria.<sup>2</sup>

#### 2 Verifying and Deploying Wys⋆ Programs

We illustrate the main concepts of Wys⋆ by showing, in several stages, how to program, optimize, and verify the two-party joint median example [32,53].

<sup>2</sup> This development was done on an older F⋆ version, but the core ideas of what we present here apply to the present version as well.

In this example, two parties, Alice and Bob, each have a set of n distinct, locally sorted integers, and they want to compute the median of the union of their sets without revealing anything else; our running example fixes n = 2, for simplicity.

#### 2.1 Secure Computations with as\_sec

In Wys⋆, as in its predecessor Wysteria [52], an MPC is written as a single specification that executes in one of two *computation modes*. The primary mode is called sec mode. In it, a computation is carried out using an MPC protocol among multiple principals. Here is the joint median in Wys⋆:

```
1 let median a b in_a in_b =
2 as_sec {a, b} (fun () →let cmp = fst (reveal in_a) > fst (reveal in_b) in
3 let x3 = if cmp then fst (reveal in_a) else snd (reveal in_a) in
4 let y3 = if cmp then snd (reveal in_b) else fst (reveal in_b) in
5 if x3 > y3 then y3 else x3)
```
The four arguments to median are, respectively, principal identifiers for Alice and Bob, and Alice's and Bob's secret inputs, expressed as tuples. In Wys⋆, values specific to each principal are *sealed* with the principal's name (which appears in the sealed container's type). As such, the types of in_a and in_b are, respectively, sealed {a} (int ∗ int) and sealed {b} (int ∗ int). The as_sec ps f construct indicates that the thunk f should be run in sec mode among the principals in the set ps. In this mode, the code has access to the secrets of the principals ps, which it can reveal using the reveal coercion. As we will see later, the type of reveal ensures that parties cannot reveal each other's inputs outside sec mode.<sup>3</sup> Also note that the code freely uses standard F⋆ library functions like fst and snd. The example extends naturally to n > 2 [3].

To run this program, both Alice and Bob would start a Wys\* interpreter at their host and direct it to run the median function. Upon reaching the as\_sec thunk, the interpreters coordinate with each other to compute the result using the underlying MPC protocol. Section 2.5 provides more details.

### 2.2 Optimizing median with as\_par

Although median gets the job done, it can be inefficient for large n. However, it turns out that if we reveal the result of the comparison on line 2 to both parties, then the computation on line 3 (resp. line 4) can be performed locally by Alice (resp. Bob) without the need for cryptography. Doing so can massively improve performance: previous work [32] observed a 30× speedup for n = 64.

This optimized variant is a *mixed-mode* computation, in which participants perform local computations interleaved with small, jointly evaluated secure computations. Wys\*'s second computation mode, par mode, supports such mixed-mode computations. The construct as\_par ps f states that each principal in ps should locally execute the thunk f, simultaneously; any principal not in

<sup>3</sup> The runtime representation of sealed a v at b's host is an opaque constant • (Sect. 2.5).

the set ps simply skips the computation. Within f, while running in par mode, principals may engage in secure computations via as\_sec.

Here is an optimized version of median using as\_par:

```
1 let median_opt a b in_a in_b =
2 let cmp = as_sec {a, b} (fun () →fst (reveal in_a) > fst (reveal in_b)) in
3 let x3 = as_par {a} (fun () →if cmp then fst (reveal in_a) else snd (reveal (in_a))) in
4 let y3 = as_par {b} (fun () →if cmp then snd (reveal in_b) else fst (reveal (in_b))) in
5 as_sec {a, b} (fun () →if reveal x3 > reveal y3 then reveal y3 else reveal x3)
```
The secure computation on line 2 *only* computes cmp and returns the result to both parties. Line 3 is then a par mode computation involving only Alice, in which she discards one of her inputs based on cmp. Similarly, on line 4, Bob discards one of his inputs. Finally, line 5 compares the remaining inputs using as\_sec and returns the result as the final median.

One might wonder whether the par mode is necessary. Could we program the local parts of a mixed-mode program in normal F\*, and use a special compiler to convert the sec mode parts to circuits and pass them to a GMW MPC service? We could, but it would complicate both writing MPCs and formally reasoning that the whole computation is correct and secure. In particular, programmers would need to write one program for each party that performs a different local computation (as in median\_opt). The potential interleavings among local computations and their synchronization behavior when securely computing together would be a source of possible error, and thus would have to be considered in any proof. For example, Alice's code might have a bug in it that prevents it from reaching a synchronization point with Bob to do a GMW-based MPC. For Wys\*, the situation is much simpler. Programmers write and maintain a single program. This program can be formally reasoned about directly using a SIMD-style, "single-threaded" semantics, per the soundness result from Sect. 3.4. This semantics permits reasoning about the coordinated behavior of multiple principals, without worrying about the effects of interleavings or incorrect synchronization. Thanks to par mode, invariants about coordinated local computations are directly evident, since we can soundly assume lockstep behavior (e.g., for the loop iterations in the PSI example in Sect. 4).

#### 2.3 Embedding a Type System for Wys\* in F\*

Designing high-level, multi-party computations is relatively easy using Wysteria's abstractions. Before trying to run such a computation, we might wonder: Is the protocol realizable? Is it correct? Is it secure?


By embedding Wys\* in F\* and leveraging its extensible, monadic, dependent type-and-effect system, we address each of these three questions. We define a new indexed monad called Wys for computations that use the MPC combinators as\_sec and as\_par. Using Wys along with the sealed type, we can ensure that protocols are realizable. Using F\*'s capabilities for formal verification, we can reason about a computation's correctness. By characterizing observable events as part of Wys, we can define trace properties of MPC programs to reason about their security.

To elaborate on the last: we are interested in *application-level* security properties, assuming that the underlying cryptographic MPC protocol (GMW [26] in our implementation) is secure. In particular, the Wys monad models the *ideal* behavior of sec mode—a secure computation reveals only the final output and nothing else. Thus the programmer could reason, for example, that optimized MPC programs reveal no more than their unoptimized versions. To relate the proofs over ideal functionality to the actual implementation, as is standard, we rely on the security of the cryptographic protocol and the composition theorem [20] to postulate that the implementation securely realizes the ideal specification.

*The Wys monad.* The Wys monad provides several features. First, all DSL code is typed in this monad, encapsulating it from the rest of F\*. Within the monad, computations and their specifications can make use of two kinds of *ghost state*: *modes* and *traces*. The mode of a computation indicates whether the computation is running in an as\_par or an as\_sec context. The trace of a computation records the sequence and nesting structure of the outputs of jointly executed as\_sec expressions—the result of a computation and its trace constitute its observable behavior. The Wys monad is, in essence, the product of a reader monad on modes and a writer monad on traces [43,62].

Formally, we define the following F\* types for modes and traces. A mode Mode m ps is a pair of a mode tag (either Par or Sec) and a set of principals ps. A trace is a forest of trace element (telt) trees. The leaves of the trees record messages TMsg x that are received as the result of executing an as\_sec thunk. The tree structure, represented by TScope ps t nodes, records the set of principals that are able to observe the messages in the trace t.

```
type mtag = Par | Sec
type mode = Mode: m:mtag →ps:prins → mode
type telt = TMsg : x:α → telt | TScope: ps:prins → t:list telt → telt
type trace = list telt
```
Every Wys computation e has a monadic computation type Wys t pre post. The type indicates that e is in the Wys monad (so it may perform multi-party computations); t is its result type; pre is a precondition on the mode in which e may be executed; and post is a postcondition relating the computation's mode, its result value, and its trace of observable events. When run in a context with mode m satisfying the precondition pre m, e may produce the trace tr, and if and when it returns, the result is a t-typed value v validating post m v tr. The style of indexing a monad with a computation's pre- and postcondition is a standard technique [7,47,59]—we defer the definition of the monad's bind and return to the actual implementation and focus instead on the specifications of the Wys\*-specific combinators. We describe as\_sec, reveal, and as\_par, and how we give them types in F\*, leaving the rest to the online technical report [54]. By convention, any free variables in the type signatures are universally prenex quantified.

*Defining as\_sec in* Wys\*

1 val as\_sec: ps:prins → (unit → Wys a pre post) → Wys a
2 (requires (fun m → m = Mode Par ps ∧ pre (Mode Sec ps)))
3 (ensures (fun m r tr → ∃t. tr=[TMsg r] ∧ post (Mode Sec ps) r t))
The type of as\_sec is *dependent* on the first parameter, ps. Its second argument f is the thunk to be evaluated in sec mode. The result's computation type has the form Wys a (requires φ) (ensures ψ), for some precondition and postcondition predicates φ and ψ, respectively. We use the requires and ensures keywords for readability—they are not semantically significant.

The precondition of as\_sec is a predicate on the mode m of the computation in whose context as\_sec ps f is called. For all the ps to jointly execute f, we require all of them to transition to perform the as\_sec ps f call simultaneously, i.e., the current mode must be Mode Par ps. We also require the precondition pre of f to be valid once the mode has transitioned to Mode Sec ps—line 2 says just this.

The postcondition of as\_sec is a predicate relating the initial mode m, the result r:a, and the trace tr of the computation. Line 3 states that the trace of a secure computation as\_sec ps f is just a singleton [TMsg r], reflecting that its execution reveals only result r. Additionally, it ensures that the result r is related to the mode in which f is run (Mode Sec ps) and some trace t according to post, the postcondition of f. The API models the "ideal functionality" of secure computation protocols (such as GMW) where the participants only observe the final result.

*Defining reveal in* Wys\*. As discussed earlier, a value v of type sealed ps t encapsulates a t value that can be accessed by calling reveal v. This call should only succeed under certain circumstances. For example, in par mode, Bob should not be able to reveal a value of type sealed {Alice} int. The type of reveal makes the access control rules clear:

```
val unseal: sealed ps α → Ghost α
val reveal: x:sealed ps α → Wys α
  (requires (fun m → (m.mode=Par =⇒ m.ps ⊆ ps) ∧ (m.mode=Sec =⇒ m.ps ∩ ps ≠ ∅)))
  (ensures (fun m r tr → r=unseal x ∧ tr=[]))
```
The unseal function is a Ghost function, meaning that it can only be used in specifications, for reasoning purposes. On the other hand, reveal can be called in concrete Wys\* programs. Its precondition says that when executing in Mode Par ps', *all* current participants must be listed in the seal, i.e., ps' ⊆ ps. However, when executing in Mode Sec ps', only some of the current participants need be listed in the seal: ps' ∩ ps ≠ ∅. This is because the secure computation is executed jointly by all of ps', so it can access any of their individual data. The postcondition of reveal relates the result r to the argument x using the unseal function.
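The precondition of reveal amounts to a small access-control check, which can be sketched in Python (our model, not the actual F\* types):

```python
def can_reveal(mode_tag, mode_ps, seal_ps):
    # mode_tag: "Par" or "Sec"; mode_ps: the current principals;
    # seal_ps: the principals named in the sealed container's type.
    if mode_tag == "Par":
        # In par mode, every current participant must be in the seal.
        return mode_ps <= seal_ps
    # In sec mode, a nonempty overlap suffices: the joint computation
    # may access any participant's individual data.
    return bool(mode_ps & seal_ps)

assert can_reveal("Par", {"a"}, {"a"})            # Alice reveals her own value
assert not can_reveal("Par", {"a", "b"}, {"a"})   # Bob is not in the seal
assert can_reveal("Sec", {"a", "b"}, {"a"})       # sec mode may use Alice's value
```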

*Defining as\_par in* Wys\*

```
1 val as_par: ps:prins →(unit →Wys a pre post) →Wys (sealed ps a)
2   (requires (fun m →m.mode=Par ∧ ps ⊆ m.ps ∧ can_seal ps a ∧ pre (Mode Par ps)))
3   (ensures (fun m r tr → ∃t. tr=[TScope ps t] ∧ post (Mode Par ps) (unseal r) t))
```
The type of as\_par requires the current mode to be Par, and ps to be a subset of the current principals. Importantly, the API scopes the trace t of f to model the fact that any observables of f are visible only to the principals in ps. Note that as\_sec does not require such scoping, as there ps and the set of current principals in m coincide. The can\_seal predicate enforces that a is a zero-order type (i.e., closures cannot be sealed) and that, in case a is already a sealed type, its set of principals is a subset of ps.

#### 2.4 Correctness and Security Verification

Using the Wys monad and the sealed type, we can write down precise types for our median and median\_opt programs, proving various useful properties. We discuss the statements of the main lemmas and the overall proof structure. By programming the protocols as a single specification using the high-level abstractions provided by Wys\*, our proofs are relatively straightforward—in all the proofs of this section, F\* required no additional hints. In particular, we rely heavily on the view that both parties execute (different fragments of) the same code, thus avoiding the unwieldy task of reasoning about low-level message passing.

*Correctness and Security of median.* We first define a pure specification of median of two int tuples:

let median\_of (x1, x2) (y1, y2) = let (\_, m, \_, \_) = sort x1 x2 y1 y2 in m

Further, we capture the preconditions using the following predicate:

let median\_pre (x1, x2) (y1, y2) = x1 < x2 ∧ y1 < y2 ∧ distinct x1 x2 y1 y2

Using these, we prove the following top-level specification for median:

```
val median: in_a:sealed {a} (int ∗ int) → in_b:sealed {b} (int ∗ int) → Wys int
  (requires (fun m → m = Mode Par {a, b})) (∗ should be called in the Par mode ∗)
  (ensures (fun m r tr → let in_a, in_b = unseal in_a, unseal in_b in
      (median_pre in_a in_b =⇒ r = median_of in_a in_b) ∧
(∗ functional correctness ∗)
      tr = [TMsg r])) (∗ trace is just the final value ∗)
```
This signature establishes that when Alice and Bob simultaneously execute median (in Par mode), with secrets in\_a and in\_b, then, if and when the protocol terminates, (a) if their inputs satisfy the precondition median\_pre, then the result is the joint median of their inputs and (b) the observable trace consists only of the final result, as there is but a single as\_sec thunk in median, i.e., it is *secure*.

*Correctness and Security of median\_opt.* The security proof of median\_opt is particularly interesting, because the program intentionally reveals more than just the final result, i.e., the output of the first comparison. We would like to verify that this additional information does not compromise the privacy of the parties' inputs. To do this, we take the following approach.

First, we characterize the observable trace of median\_opt as a pure, specification-only function. Then, using relational reasoning, we prove a *noninterference with delimited release* property [55] on these traces. Essentially, we prove that, for two runs of median\_opt in which Bob's inputs and the output median are the same, the observable traces are also the same, irrespective of Alice's inputs. Thus, from Alice's perspective, the observable trace does not reveal more to Bob than what the output already does. We prove this property symmetrically for Bob.

We start by defining a trace function for median\_opt:

let opt\_trace a b (x1, \_) (y1, \_) r =
[ TMsg (x1 > y1); (∗ observable from the first as\_sec ∗)
TScope {a} []; TScope {b} []; (∗ observables from the two local as\_par ∗)
TMsg r ] (∗ observable from the final as\_sec ∗)

A trace will have four elements: output of the first as\_sec computation, two empty scoped traces for the two local as\_par computations, and the final output.

Using this function, we prove correctness of median\_opt, thus:

val median\_opt: in\_a:sealed {a} (int ∗ int) → in\_b:sealed {b} (int ∗ int) → Wys int
(requires (fun m → m = Mode Par {a, b})) (∗ should be called in the Par mode ∗)
(ensures (fun m r tr → let in\_a = unseal in\_a in let in\_b = unseal in\_b in
(median\_pre in\_a in\_b =⇒ r = median\_of in\_a in\_b) ∧ (∗ functional correctness ∗)
tr = opt\_trace a b in\_a in\_b r)) (∗ opt\_trace precisely describes the observable trace ∗)

The delimited release property is then captured by the following lemma:

val median\_opt\_is\_secure\_for\_alice: a:prin → b:prin
→ in\_a1:(int ∗ int) → in\_a2:(int ∗ int) → in\_b:(int ∗ int) (∗ possibly different in\_a1, in\_a2 ∗)
→ Lemma (requires (median\_pre in\_a1 in\_b ∧ median\_pre in\_a2 in\_b ∧
median\_of in\_a1 in\_b = median\_of in\_a2 in\_b)) (∗ but same median ∗)
(ensures (opt\_trace a b in\_a1 in\_b (median\_of in\_a1 in\_b) = (∗ ensures .. ∗)
opt\_trace a b in\_a2 in\_b (median\_of in\_a2 in\_b))) (∗ .. same trace ∗)

The lemma proves that for two runs of median\_opt in which Bob's input and the final output remain the same, but Alice's inputs vary arbitrarily, the observable traces are the same. As such, no more information about Alice's inputs leaks via the traces than what is already revealed by the output. We also prove a symmetric lemma median\_opt\_is\_secure\_for\_bob.
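The lemma is amenable to a quick sanity check outside F\*. The following Python model (ours) of median\_of and the informative part of opt\_trace brute-forces the delimited-release property over a small domain:

```python
from itertools import combinations

def median_of(a, b):
    # Joint median: the 2nd smallest element of the union.
    return sorted(a + b)[1]

def trace(a, b, r):
    # The informative observables of median_opt: the revealed
    # comparison and the final result (the scoped traces are empty).
    return (a[0] > b[0], r)

def pre(a, b):
    return a[0] < a[1] and b[0] < b[1] and len(set(a + b)) == 4

pairs = list(combinations(range(6), 2))  # sorted, distinct pairs over 0..5
for b in pairs:
    for a1 in pairs:
        for a2 in pairs:
            if pre(a1, b) and pre(a2, b) and \
               median_of(a1, b) == median_of(a2, b):
                # Same output for Bob => same observable trace.
                assert trace(a1, b, median_of(a1, b)) == \
                       trace(a2, b, median_of(a2, b))
```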

In short, because the Wys monad exposes the observable traces in the logic, they can be used to prove properties, relational or otherwise, in the pure fragment of F\* outside the Wys monad. We present more examples and their verification details in Sect. 4.

Fig. 1. Architecture of a Wys\* deployment

#### 2.5 Deploying Wys\* Programs

Having defined a proved-secure MPC program in Wys\*, how do we run it? Doing so requires the following steps (Fig. 1). First, we run the F\* compiler in a special mode that *extracts* the Wys\* code (say psi.fst) into the Wys\* AST as a data structure (in psi.ml). Except for the Wys\*-specific nodes (as\_sec, as\_par, etc.), the rest of the program is extracted into *FFI nodes* that indicate the use of, or calls into, functionality provided by F\* itself.

The next step is for each party to run the extracted AST using the Wys\* interpreter. This interpreter is written in F\*, and we have proved (see Sect. 3.5) that it implements a deep embedding of the Wys\* semantics, also specified in F\* (Figs. 5 and 6, Sect. 3). The interpreter is extracted to OCaml by the usual F\* extraction. Each party's interpreter executes the AST locally until it reaches an as\_sec ps f node, at which point the interpreter's back-end compiles f, on the fly, for the particular values of the secrets in f's environment, to a boolean circuit. First-order, loop-free code can be compiled to a circuit; Wys\* provides specialized support for several common combinators (e.g., fst, snd, and list combinators such as List.intersect, List.mem, List.nth, etc.).

The circuit is handed to a library by Choi et al. [22] that implements the GMW [26] MPC protocol. Running the GMW protocol involves the parties in ps generating and communicating (XOR-based) secret shares [57] of their secret inputs, and then cooperatively evaluating the boolean circuit for f over them. While our implementation currently uses the GMW protocol, it should be possible to plug in other MPC protocols as well.
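XOR-based secret sharing itself is simple to model; here is a generic Python sketch (an illustration of the idea only, not the interface of the Choi et al. library):

```python
import secrets

def share(x, n_parties, bits=32):
    # n-1 uniformly random shares, plus one chosen so that the
    # XOR of all shares equals the secret x.
    shares = [secrets.randbits(bits) for _ in range(n_parties - 1)]
    last = x
    for s in shares:
        last ^= s
    return shares + [last]

def reconstruct(shares):
    out = 0
    for s in shares:
        out ^= s
    return out

assert reconstruct(share(42, 2)) == 42
assert reconstruct(share(7, 3)) == 7
```

Each individual share is uniformly distributed, so no proper subset of the parties learns anything about x from its shares alone.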

One obvious question is how both parties are able to get this process off the ground, given that they don't know some of the inputs (e.g., other parties' secrets). The sealed abstraction helps here. Recall that for median, the types of the inputs are of the form sealed {a} (int ∗ int) and sealed {b} (int ∗ int). When the program is run on Alice's host, the former will be a pair of Alice's values, whereas the latter will be an opaque constant (which we denote as •). The reverse will

Fig. 2. Wys\* syntax

be true on Bob's host. When the circuit is constructed, each principal links their non-opaque inputs to the relevant input wires of the circuit. Similarly, the output map component of each party is derived from their output wires in the circuit, and thus, each party only gets to see their own output.

#### 3 Formalizing and Implementing Wys\*

In the previous section, we presented examples of verifying properties about Wys\* programs using F\*'s logic. However, these programs are not executed using the F\* (single-threaded) semantics; they have a distributed semantics involving multiple parties. So, how do the properties that we verify using F\* carry over?

In this section, we present the metatheory that answers this question. First, we formalize the Wys\* single-threaded (ST) semantics, which faithfully models the F\* semantics of the Wys\* API presented in Sect. 2. Next, we formalize the distributed (DS) semantics that multiple parties use to run Wys\* programs. Then we prove that the former is *sound* with respect to the latter, so that properties proved of programs under ST apply when they run under DS. We have mechanized the proof of this theorem in F\*.

#### 3.1 Syntax

Figure 2 shows the complete syntax of Wys\*. Principals and principal sets are first-class values, denoted by p and s respectively. Constants in the language also include () (unit), booleans (⊤ and ⊥), and FFI constants c. Expressions e include the regular forms for functions, applications, let bindings, etc., and the Wys\*-specific constructs. Among the ones that we have not seen in Sect. 2, the expression mkmap e1 e2 creates a map from the principals in e1 (a principal set) to the value computed by e2; project e1 e2 projects the value of principal e1 from the map e2; and concat e1 e2 concatenates the two maps. The maps are used when an as\_sec computation returns different outputs to the parties.
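In ordinary terms, the three map combinators behave like the following dictionary operations (an illustrative Python model, not the Wys\* implementation):

```python
def mkmap(prins, v):
    # Map every principal in the set to the given value.
    return {p: v for p in prins}

def project(p, m):
    # Select principal p's entry from the map.
    return m[p]

def concat(m1, m2):
    # Concatenate two maps.
    merged = dict(m1)
    merged.update(m2)
    return merged

m = concat(mkmap({"alice"}, 1), mkmap({"bob"}, 2))
assert project("alice", m) == 1
assert project("bob", m) == 2
```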

Host language (i.e., F\*) constructs are also part of the syntax of Wys\*, including constants c for strings, integers, lists, tuples, etc. Likewise, host-language functions/primitives can be called from Wys\*—ffi f ē is the invocation of a host-language function f with arguments ē. The FFI confers two benefits. First, it simplifies the core language while still allowing full consideration of security-relevant properties. Second, it helps the language scale by incorporating many of the standard features, libraries, etc. from the host language.

Map m ::= · | m[p → v]
Value v ::= p | s | () | ⊤ | ⊥ | m | v̄ | (L, λx.e) | (L, fix f.λx.e) | sealed s v | •
Mode M ::= Par s | Sec s
Context E ::= ⟨⟩ | as\_par E e | as\_par v E | as\_sec E e | as\_sec v E | ...
Frame F ::= (M, L, E, T)
Stack X ::= · | F, X
Environment L ::= · | L[x → v]
Trace element t ::= TMsg v | TScope s T
Trace T ::= · | t, T
Configuration C ::= M; X; L; T; e
Par component P ::= · | P[p → C]
Sec component S ::= · | S[s → C]
Protocol π ::= P; S

Fig. 3. Wys\* runtime configuration syntax



Fig. 4. Wys-ST semantics (selected rules)

#### 3.2 Single-Threaded Semantics

We formalize the semantics in the style of Hieb and Felleisen [24], where the redex is chosen by (standard, not shown) *evaluation contexts* E, which prescribe left-to-right, call-by-value evaluation order. The ST semantics, a model of the F\* semantics and the Wys\* API, defines a judgment C → C′ that represents a single step of an abstract machine (Fig. 4). Here, C is a *configuration* M; X; L; T; e. This five-tuple consists of a mode M, a stack X, a local environment L, a trace T, and an expression e. The syntax for these elements is given in Fig. 3. The value form v̄ represents host-language (FFI) values. The stack and environment are standard; trace T and mode M were discussed in the previous section.

For space reasons, we focus on the two main Wys\* constructs, as\_par and as\_sec. Our technical report [54] shows the other Wys\*-specific constructs.

Rules S-aspar and S-asparret (Fig. 4) reduce an as\_par expression once its arguments are fully evaluated—its first argument s is a principal set, while the second argument (L1, λx.e) is a closure where L1 captures the free variables of the thunk λx.e. S-aspar first checks that the current mode M is Par and contains all the principals from the set s. It then pushes a seal s frame on the stack, and

P-par: if C ⇝ C′, then P[p → C]; S −→ P[p → C′]; S

P-enter: if for all p ∈ s, P[p].e = as\_sec s (Lp, λx.e), s ∉ dom(S), and L = combine L̄p, then P; S −→ P; S[s → Sec s; ·; L[x → ()]; ·; e]

P-sec: if C → C′, then P; S[s → C] −→ P; S[s → C′]

P-exit: if S[s] = Sec s; ·; L; T; v, P′ updates P[p] with (slice\_v p v) for all p ∈ s, and S′ = S \ s, then P; S −→ P′; S′

Fig. 5. Wys\* protocol semantics (selected rules)


Fig. 6. Distributed semantics, selected local rules (the mode M is always Par p)

starts evaluating e under the environment L1[x → ()]. The rule S-asparret pops the frame and seals the result, so that it is accessible only to the principals in s. The rule also creates a trace element TScope s T, essentially making the observations made during the reduction of e (i.e., T) visible only to the principals in s.

Turning to as\_sec, the rule S-assec checks the precondition of the API, and the rule S-assecret generates a trace observation TMsg v, as per the postcondition of the API. As mentioned before, as\_sec semantics models the ideal, trusted third-party semantics of secure computations where the participants only observe the final output. We can confirm that the rules implement the types of as\_par and as\_sec shown in Sect. 2.

#### 3.3 Distributed Semantics

In the DS semantics, principals evaluate the same program locally and asynchronously until they reach a secure computation, at which point they synchronize to jointly perform the computation. The semantics consists of two parts: (a) a judgment of the form π −→ π′ (Fig. 5), where a protocol π is a tuple (P; S) such that P maps each principal to its local configuration and S maps a set of principals to the configuration of an ongoing secure computation; and (b) a local evaluation judgment C ⇝ C′ (Fig. 6) that models how a single principal behaves while in par mode.

Rule P-par in Fig. 5 models a single party taking a step, per the local evaluation rules. Figure 6 shows these rules for as\_par. (See the technical report [54] for more local evaluation rules.) A principal either participates in the as\_par computation or skips it. Rules L-aspar1 and L-asparret handle the case when p ∈ s, and so the principal p participates in the computation. The rules closely mirror the corresponding ST semantics rules in Fig. 4. One difference, in the rule L-asparret, is that the trace T is not scoped. In the DS semantics, traces only contain TMsg elements; i.e., a trace is the (flat) list of secure computation outputs observed by that principal. If p ∉ s, the principal skips the computation, with the result being a sealed value containing the opaque constant • (rule L-aspar2). The contents of the sealed value do not matter, since the principal will not be allowed to unseal the value anyway.

As should be the case, there are no local rules for as\_sec—to perform a secure computation, parties need to combine their data and jointly do the computation. Rule P-enter in Fig. 5 handles the case when principals enter a secure computation. It requires that all the principals p ∈ s have an expression of the form as\_sec s (Lp, λx.e), where Lp is their local environment associated with the closure. Each party's local environment contains its secret values (in addition to some public values). Conceptually, a secure computation *combines* these environments, thereby producing a joint view, and evaluates e under the combination. We define an auxiliary combine function for this purpose:

combine\_v (•, v) = v
combine\_v (v, •) = v
combine\_v (sealed s v1, sealed s v2) = sealed s (combine\_v v1 v2)
...

The rule P-enter combines the principals' environments, and creates a new entry in the S map. The principals are now waiting for the secure computation to finish. Rule P-sec models a stepping rule inside the sec mode.

The rule P-exit applies when a secure computation has completed and returns results to the waiting principals. If the secure computation terminates with value v, each principal p gets the value slice\_v p v. The slice\_v function is analogous to combine, but in the opposite direction—it strips off the parts of v that are not accessible to p:

```
slice_v p (sealed s v) = sealed s •, if p ∉ s
slice_v p (sealed s v) = sealed s (slice_v p v), if p ∈ s
...
```
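The intended duality—combine\_v undoes the slicing of a value across the parties' views—can be illustrated with a small Python model (ours) of sealed values and the opaque constant:

```python
OPAQUE = "opaque"  # models the constant •

def slice_v(p, v):
    # Strip the parts of v that principal p may not access.
    if isinstance(v, tuple) and v[0] == "sealed":
        _, ps, inner = v
        if p in ps:
            return ("sealed", ps, slice_v(p, inner))
        return ("sealed", ps, OPAQUE)
    return v

def combine_v(v1, v2):
    # Merge two views: a concrete value wins over an opaque one.
    if v1 == OPAQUE:
        return v2
    if v2 == OPAQUE:
        return v1
    if isinstance(v1, tuple) and v1[0] == "sealed":
        return ("sealed", v1[1], combine_v(v1[2], v2[2]))
    return v1

v = ("sealed", frozenset({"alice"}), 42)
# Alice's slice keeps 42; Bob's slice is opaque; combining recovers v.
assert combine_v(slice_v("alice", v), slice_v("bob", v)) == v
```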
In the rule P-exit, the notation C ◁ v is defined as:

(M; X; L; T; \_) ◁ v = M; X; L; append T [TMsg v]; v

That is, the returned value is also added to the principal's trace to note their observation of the value.

### 3.4 Metatheory

Our goal is to show that the ST semantics faithfully represents the semantics of Wys\* programs as they are executed by multiple parties, i.e., according to the DS semantics. We do this by proving *simulation* of the ST semantics by the DS semantics, and by proving *confluence* of the DS semantics. Our F\* development mechanizes all the metatheory presented in this section.

*Simulation.* We define a function slice s C that returns the corresponding protocol πC for an ST configuration C. In the P component of πC, each principal p ∈ s is mapped to their *slice* of the protocol. For slicing values, we use the same slice\_v function as before. Traces are sliced as follows:

slice\_tr p (TMsg v) = [TMsg (slice\_v p v)]
slice\_tr p (TScope s T) = slice\_tr p T, if p ∈ s
slice\_tr p (TScope s T) = [], if p ∉ s

The slice of an expression (e.g., the source program) is itself. For all other components of C, slice functions are defined analogously.

We say that C is *terminal* if it is in Par mode and is fully reduced to a value (i.e. when C = \_; X; \_; \_; e, e is a value and X is empty). Similarly, a protocol π = (P, S) is terminal if S is empty and all the local configurations in P are terminal. The simulation theorem is then the following:

Theorem 1 (Simulation of ST by DS). *Let* s *be the set of all principals. If* C1 →∗ C2*, and* C2 *is terminal, then there exists some derivation* (slice s C1) −→∗ (slice s C2) *such that* (slice s C2) *is terminal.*

To state *confluence*, we first define the notion of *strong termination*.

Definition 1 (Strong termination). *If all possible runs of protocol* π *terminate at* πt*, we say* π strongly terminates in πt*, written* π ⇓ πt*.*

Our confluence result then says:

Theorem 2 (Confluence of DS). *If* π −→∗ πt *and* πt *is terminal, then* π ⇓ πt*.*

Combining the two theorems, we get a corollary that establishes the soundness of the ST semantics w.r.t. the DS semantics:

Corollary 1 (Soundness of ST semantics). *Let* s *be the set of all principals. If* C1 →∗ C2*, and* C2 *is terminal, then* (slice s C1) ⇓ (slice s C2)*.*

Now suppose that for a Wys\* source program, we prove in F\* a postcondition that the result is sealed alice n, for some n > 0. By the soundness of the ST semantics, we can conclude that when the program is run in the DS semantics, it may diverge, but if it terminates, alice's output will also be sealed alice n, and for all other principals the output will be sealed alice •. Aside from the correspondence on results, our semantics also covers correspondence on traces. Thus the correctness and security properties that we prove about a Wys\* program using F\*'s logic hold for the program that actually runs.

### 3.5 Implementation

The formal semantics presented in the prior section is mechanized as an inductive type in F\*. This style is useful for proving properties, but does not directly translate to an implementation. Therefore, we implement an interpretation function step in F\* and prove that it corresponds to the rules; i.e., that for all input configurations C, step(C) = C′ implies that C → C′ according to the semantics. Then, the core of each principal's implementation is an F\* stub function tstep that repeatedly invokes step on the AST of the source program (produced by the F\* extractor run in a custom mode), unless the AST is an as\_sec node. Functions step and tstep are extracted to OCaml by the standard F\* extraction process.

Local evaluation is not defined for the as\_sec node, so the stub implements what amounts to P-enter and P-exit from Fig. 5. When the stub notices the program has reached an as\_sec expression, it calls into a circuit library we have written that converts the AST of the second argument of as\_sec to a boolean circuit. This circuit and the encoded inputs are communicated to a co-hosted server that implements the GMW MPC protocol [22]. The server evaluates the circuit, coordinating with the GMW servers of the other principals, and sends back the result. The circuit library decodes the result and returns it to the stub. The stub then carries on with the local evaluation. Our FFI interface currently provides a form of monomorphic, first-order interoperability between the (dynamically typed) interpreter and the host language.

Our F⋆ formalization of the Wys⋆ semantics, including the AST specification, is 1900 lines of code. This formalization is used both by the metatheory and by the (executable) interpreter. The metatheory that connects the ST and DS semantics (Sect. 3) is 3000 lines. The interpreter and its correctness proof are another 290 lines of F⋆ code. The interpreter's step function is essentially a big switch-case on the current expression that calls into the functions from the semantics specification. The tstep stub is another 15 lines. The size of the circuit library, not including the GMW implementation, is 836 lines. The stub, the implementation of GMW, the circuit library, and the F⋆ toolchain (including the custom Wys⋆ extraction mode) are part of our Trusted Computing Base (TCB).

### 4 Applications

In addition to joint median, presented in Sect. 2, we have implemented and proved properties of two other MPC applications, *dealing for online card games* and *private set intersection* (PSI).

*Card Dealing.* We have implemented an MPC-based card dealing application in Wys⋆. Such an application can play the role of the dealer in a game of online poker, thereby eliminating the need to trust the game portal for card dealing. The application relies on Wys⋆'s support for *secret shares* [57]. Using secret shares, the participating parties can share a value in a way that none of the parties can observe the actual value individually (each party's share consists of some random-looking bytes), but they can recover the value by combining their shares in sec mode.

In the application, the parties maintain a list of secret shares of already dealt cards (the number of already dealt cards is public information). To deal a new card, each party first generates a random number locally. The parties then perform a secure computation to compute the sum of their random numbers modulo 52; call it n. The output of the secure computation is secret shares of n. Before declaring n as the newly dealt card, the parties need to ensure that card n has not already been dealt. To do so, they iterate over the list of secret shares of already dealt cards and, for each element of the list, check that it is different from n. The check is performed in a secure computation that simply combines the shares of n, combines the shares of the list element, and checks the equality of the two values. If n is different from all the previously dealt cards, it is declared to be the new card; otherwise the parties repeat the protocol, again generating a fresh random number each.
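The round structure above can be sketched as follows, with plain values standing in for secret shares and the secure computations computed directly; the function name `deal_round` is ours, not part of the Wys⋆ API:

```python
import random

NUM_CARDS = 52

def deal_round(num_parties, dealt):
    """One round of the dealing protocol, with plain values standing in
    for secret shares. Returns the freshly dealt card."""
    while True:
        # Each party picks a random number locally.
        picks = [random.randrange(NUM_CARDS) for _ in range(num_parties)]
        # In sec mode the parties would jointly compute the sum mod 52
        # over their shares; here we compute it directly.
        n = sum(picks) % NUM_CARDS
        # Freshness check: compare n against every previously dealt card.
        if all(c != n for c in dealt):
            dealt.append(n)
            return n

dealt = []
cards = [deal_round(3, dealt) for _ in range(10)]
assert len(set(cards)) == 10  # all dealt cards are distinct
```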

Wys⋆ provides the following API for secret shares:

```
type Sh: Type → Type
type can_sh: Type → Type
assume Cansh_int: can_sh int

val v_of_sh: sh:Sh α → Ghost α
val ps_of_sh: sh:Sh α → Ghost prins

val mk_sh: x:α → Wys (Sh α)
  (requires (fun m → m.mode = Sec ∧ can_sh α))
  (ensures (fun m r tr → v_of_sh r = x ∧ ps_of_sh r = m.ps ∧ tr = []))

val comb_sh: x:Sh α → Wys α
  (requires (fun m → m.mode = Sec ∧ ps_of_sh x = m.ps))
  (ensures (fun m r tr → v_of_sh x = r ∧ tr = []))
```
Type Sh α types the shares of values of type α. Our implementation currently supports shares of int values only; the can\_sh predicate enforces this restriction on the source programs. Extending secret shares support to other types (such as pairs) should be straightforward (as in [52]). Functions v\_of\_sh and ps\_of\_sh are marked Ghost, meaning that they can only be used in specifications for reasoning purposes. In the concrete code, shares are created and combined using the mk\_sh and comb\_sh functions. Together, the specifications of these functions enforce that the shares are created and combined by the same set of parties (through ps\_of\_sh), and that comb\_sh recovers the original value (through v\_of\_sh). The Wys⋆ interpreter transparently handles the low-level details of extracting shares from the GMW implementation of Choi et al. (mk\_sh), and reconstituting the shares back (comb\_sh).
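As an intuition for these specifications, mk\_sh and comb\_sh behave like additive secret sharing: each share alone looks uniformly random, while all shares together determine the value. The following Python model is our own illustration of that contract (the actual backend is the GMW implementation of Choi et al., not this scheme):

```python
import random

MOD = 2 ** 32

def mk_sh(x, num_parties):
    """Split x into additive shares mod MOD: each share individually is
    uniformly random, but the shares sum to x (mk_sh's v_of_sh r = x)."""
    shares = [random.randrange(MOD) for _ in range(num_parties - 1)]
    shares.append((x - sum(shares)) % MOD)
    return shares

def comb_sh(shares):
    """Recover the shared value, mirroring comb_sh's postcondition
    v_of_sh x = r."""
    return sum(shares) % MOD

shares = mk_sh(42, 3)
assert comb_sh(shares) == 42
```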

In addition to implementing the card dealing application in Wys⋆, we have formally verified that the returned card is fresh. The signature of the function that checks for freshness of the newly dealt card is as follows (abc is the set of three parties in the computation):

```
val check_fresh:
    l:list (Sh int){∀ s'. mem s' l =⇒ ps_of_sh s' = abc}
  → s:Sh int{ps_of_sh s = abc}
  → Wys bool (requires (fun m → m.mode = Par ∧ m.ps = abc))
             (ensures (fun m r tr →
                r ⇐⇒ (∀ s'. mem s' l =⇒ not (v_of_sh s' = v_of_sh s))))
```
The specification says that the function takes two arguments: l is the list of secret shares of already dealt cards, and s is the secret share of the newly dealt card. The function returns a boolean r that is true iff the concrete value (v\_of\_sh) of s is different from the concrete values of all the elements of the list l. Using F⋆, we verify that the implementation of check\_fresh meets this specification.

*PSI.* Consider a dating application that enables its users to compute their common interests without revealing all of them. This is an instance of the more general private set intersection (PSI) problem [28].

We implement a straightforward version of PSI in Wys⋆:

```
let psi a b (input_a:sealed {a} (list int)) (input_b:sealed {b} (list int)) (l_a:int) (l_b:int) =
  as_sec {a,b} (fun () →List.intersect (reveal input_a) (reveal input_b) l_a l_b)
```
where the input sets are expressed as lists with public lengths.

Huang et al. [28] provide an optimized PSI algorithm that performs much better when the density of common elements in the two sets is high. We implement their algorithm in Wys⋆. The optimized version consists of two nested loops – an outer loop over Alice's set and an inner loop over Bob's – where an iteration of the inner loop compares the current element of Alice's set with the current element of Bob's. The nested loops are written using as\_par so that both Alice and Bob execute the loops in lockstep (note that the set sizes are public), while the comparison in the inner loop happens using as\_sec. Rather than performing all l\_a ∗ l\_b comparisons naively, Huang et al. [28] observe that once an element ax of Alice's set matches an element bx of Bob's set, the inner loop can return immediately, skipping the comparisons of ax with the rest of Bob's set. Furthermore, bx can be removed from Bob's set, excluding it from any further comparisons with other elements of Alice's set. Since there are no repeats in the input sets, all the excluded comparisons are guaranteed to be false. We show the full code and its performance comparison with psi in the technical report [54].
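The comparison-skipping optimization can be sketched in plain Python over public lists (a model of ours; in Wys⋆ each equality test would run under as\_sec, and the loop structure under as\_par):

```python
def psi_opt(a, b):
    """Optimized PSI sketch: stop the inner loop on a match and remove
    the matched element from Bob's set. Also counts comparisons made."""
    b = list(b)           # working copy of Bob's set
    common, comparisons = [], 0
    for ax in a:
        for j, bx in enumerate(b):
            comparisons += 1
            if ax == bx:
                common.append(ax)
                del b[j]  # bx can never match another element of a
                break     # skip the rest of Bob's set
    return common, comparisons

common, cmps = psi_opt([1, 2, 3, 4], [2, 3, 4, 5])
assert sorted(common) == [2, 3, 4]
assert cmps < 4 * 4   # fewer than the naive l_a * l_b comparisons
```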

As with the median example from Sect. 2, the optimized PSI intentionally reveals more for performance gains. As such, we would like to verify that the optimizations do not reveal more about parties' inputs. We take the following stepwise refinement approach. First, we characterize the trace of the optimized implementation as a pure function trace\_psi\_opt la lb (omitted for space reasons), and show that the trace of psi\_opt is precisely trace\_psi\_opt la lb.

Then, we define an intermediate PSI implementation that has the same nested loop structure, but performs l\_a ∗ l\_b comparisons without any optimizations. We characterize the trace of this intermediate implementation as the pure function trace\_psi, and show that it precisely captures the trace.

To show that trace\_psi does not reveal more than the intersection of the input sets, we prove the following lemma.

```
Ψ la0 la1 lb0 lb1 :=                                (∗ possibly diff input sets, but with ∗)
  la0 ∩ lb0 = la1 ∩ lb1 ∧                           (∗ intersections the same ∗)
  length la0 = length la1 ∧ length lb0 = length lb1 (∗ lengths the same ∗)
```

```
val psi_interim_is_secure: la0:_ → lb0:_ → la1:_ → lb1:_ → Lemma
  (requires (Ψ la0 la1 lb0 lb1))
  (ensures (permutation (trace_psi la0 lb0) (trace_psi la1 lb1)))
```
The lemma essentially says that for two runs on inputs of the same lengths and with the same intersection, the resulting traces are permutations of each other.<sup>4</sup> We can reason about the traces of psi\_interim up to permutation because Alice has no prior knowledge of the choice of representation of Bob's set (Bob can shuffle his list), so she cannot learn anything from a permutation of the trace.<sup>5</sup> This establishes the security of psi\_interim.

Finally, we can connect psi\_interim to psi\_opt by showing that there exists a function f, such that for any trace tr=trace\_psi la lb, the trace of psi\_opt, trace\_psi\_opt la lb, can be computed by f (length la) (length lb) tr. In other words, the trace produced by the optimized implementation can be computed using a function of information already available to Alice (or Bob) when she (or he) observes a run of the secure, unoptimized version psi\_interim la lb. As such, the optimizations do not reveal further information.

### 5 Related Work

*Source MPC Verification.* While the verification of the underlying crypto protocols has received some attention [4,5], verification of the correctness and security properties of MPC source programs has remained largely unexplored, surprisingly so given that the goal of MPC is to preserve the privacy of secret inputs. The only previous work that we know of is that of Backes et al. [9], who devise an applied pi-calculus-based abstraction for MPC and use it for formal verification. For an auction protocol that computes the min function, their abstraction comprises about 1400 lines of code. Wys⋆, on the other hand, enables direct verification of the higher-level MPC source programs, rather than their models, and in addition provides a partially verified toolchain.

*Wysteria.* Wys⋆'s computational model is based on the programming abstractions of a previous MPC DSL, Wysteria [52]. Wys⋆'s realization as an embedded DSL in F⋆ makes important advances. In particular, Wys⋆ (a) enhances the Wysteria semantics with a notion of observable traces, and provides the novel capability to prove security and correctness properties about mixed-mode MPC source programs, (b) expands the available programming constructs by drawing on the features and libraries of F⋆, and (c) adds assurance via a (partially) proved-correct interpreter.

*Verified MPC Toolchain.* Almeida et al. [4] build a verified toolchain consisting of (a) a verified circuit compiler from (a subset of) C to boolean circuits, and (b) a verified implementation of Yao's [65] garbled circuits protocol for 2-party MPC. They use CompCert [36] for the former, and EasyCrypt [11] for the latter. These are significant advances, but there are several distinctions from our work. The MPC programs in their toolchain are not *mixed-mode*, and thus their toolchain cannot express

<sup>4</sup> Holding Bob's (resp. Alice's) inputs fixed and varying Alice's (resp. Bob's) inputs, as done for median in Sect. 2.4, is covered by this more general property.

<sup>5</sup> We could formalize this observation using a probabilistic, relational variant of F⋆ [10].

examples like median\_opt and the optimized PSI. Their framework does not enable formal verification of source programs as Wys⋆ does. It may be possible to use other frameworks for verifying C programs (e.g., Frama-C [1]), but it is inconvenient, as one has to work in the subset of C that falls in the intersection of these tools. Wys⋆ is also more general in that it supports general n-party MPC; e.g., the card dealing application in Sect. 4 has 3 parties. Nevertheless, Wys⋆ may use their verified Yao implementation for the special case of 2 parties.

*MPC DSLs and DSL Extensions.* In addition to Wysteria, several other MPC DSLs have been proposed in the literature [14,17,27,29,34,37,39,48,49,52,56,61]. Most of these languages have standalone implementations, with the (usability/scalability) drawbacks that come with them. Like Wys⋆, a few are implemented as language extensions. Launchbury et al. [35] describe a Haskell-embedded DSL for writing low-level "share protocols" on a multi-server "SMC machine". OblivC [66] is an extension to C for two-party MPC that annotates variables and conditionals with an obliv qualifier to identify private inputs; these programs are compiled by source-to-source translation. The former is essentially a shallow embedding, and the latter is compiler-based; Wys⋆ is unique in that it combines a shallow embedding to support source program verification and a deep embedding to support a non-standard target semantics. Recent work [19,21] compiles to cryptographic protocols that include both arithmetic and boolean circuits; the compiler decides which fragments of the program fall into which category. It would be interesting to integrate such a backend into Wys⋆.

*Mechanized Metatheory.* Our verification results are different from a typical verification result that might either mechanize metatheory for an idealized language [8], or might prove an interpreter or compiler correct w.r.t. a formal semantics [36]: we do both. We mechanize the metatheory of Wys⋆, establishing the soundness of the conceptual ST semantics w.r.t. the actual DS semantics, and mechanize the proof that the interpreter implements the correct DS semantics.

*General DSL Implementation Strategies.* DSLs (for MPC or other purposes) are implemented in various ways, such as by developing a standalone compiler/interpreter, or by shallow or deep embedding in a host language. Our approach bears relation to the approach taken in LINQ [42], which embeds a query language in normal C# programs, and implements these programs by extracting the query syntax tree and passing it to a *provider* to implement for a particular backend. Other researchers have embedded DSLs in verification-oriented host languages (e.g., Bedrock [13] in Coq [60]) to permit formal proofs of DSL programs. Low⋆ [51] is a shallow embedding of a small, sequential, well-behaved subset of C in F⋆ that extracts to C using an F⋆-to-C compiler. Low⋆ has been used to verify and implement several cryptographic constructions. Fromherz et al. [25] present a deep embedding of a subset of x64 assembly in F⋆ that allows efficient verification of assembly and its interoperation with C code generated from Low⋆. They design (and verify) a custom VC generator for the deeply embedded DSL that allows the proofs of assembly crypto routines to scale.

### 6 Conclusions

This paper has presented Wys⋆, the first DSL to enable formal verification of efficient source MPC programs written in a full-featured host programming language, F⋆. The paper presented examples such as joint median, card dealing, and PSI, and showed how the DSL enables their correctness and security proofs. The Wys⋆ implementation, examples, and proofs are publicly available on GitHub.

Acknowledgments. We would like to thank the anonymous reviewers, Catalin Hriţcu, and Matthew Hammer for helpful comments on drafts of this paper. This research was funded in part by the U.S. National Science Foundation under grants CNS-1563722, CNS-1314857, and CNS-1111599.

### References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Generalised Differential Privacy for Text Document Processing**

Natasha Fernandes<sup>1,2(B)</sup>, Mark Dras<sup>1</sup>, and Annabelle McIver<sup>1</sup>

<sup>1</sup> Macquarie University, Sydney, Australia
natasha.fernandes@hdr.mq.edu.au

<sup>2</sup> Inria, Paris-Saclay and École Polytechnique, Palaiseau, France

**Abstract.** We address the problem of how to "obfuscate" texts by removing stylistic clues which can identify authorship, whilst preserving (as much as possible) the content of the text. In this paper we combine ideas from "generalised differential privacy" and machine learning techniques for text processing to model privacy for text documents. We define a privacy mechanism that operates at the level of text documents represented as "bags-of-words"—these representations are typical in machine learning and contain sufficient information to carry out many kinds of classification tasks including *topic identification* and *authorship attribution* (of the original documents). We show that our mechanism satisfies privacy with respect to a metric for semantic similarity, thereby providing a balance between utility, defined by the semantic content of texts, and the obfuscation of stylistic clues. We demonstrate our implementation on a "fan fiction" dataset, confirming that it is indeed possible to disguise writing style effectively whilst preserving enough information and variation for accurate content classification tasks. We refer the reader to our complete paper [15] which contains full proofs and further experimentation details.

**Keywords:** Generalised differential privacy · Earth Mover's metric · Natural language processing · Author obfuscation

### **1 Introduction**

Partial public release of formerly classified data incurs the risk that more information is disclosed than intended. This is particularly true of data in the form of text such as government documents or patient health records. Nevertheless there are sometimes compelling reasons for declassifying data in some kind of "sanitised" form—for example government documents are frequently released as redacted reports when the law demands it, and health records are often shared to facilitate medical research. Sanitisation is most commonly carried out by hand but, aside from the cost incurred in time and money, this approach provides no guarantee that the original privacy or security concerns are met.

We acknowledge the support of the Australian Research Council Grant DP140101119.

© The Author(s) 2019
F. Nielson and D. Sands (Eds.): POST 2019, LNCS 11426, pp. 123–148, 2019.
https://doi.org/10.1007/978-3-030-17138-4_6

To encourage researchers to focus on privacy issues related to text documents the digital forensics community PAN@Clef ([41], for example) proposed a number of challenges that are typically tackled using *machine learning*. In this paper our aim is to demonstrate how to use ideas from *differential privacy* to address some aspects of the PAN@Clef challenges by showing how to provide strong a priori privacy guarantees in document disclosures.

We focus on the problem of *author obfuscation*, namely to automate the process of changing a given document so that as much as possible of its original substance remains, but that the author of the document can no longer be identified. Author obfuscation is very difficult to achieve because it is not clear exactly what to change that would sufficiently mask the author's identity. In fact author properties can be determined by "writing style" with a high degree of accuracy: this can include author identity [28] or other undisclosed personal attributes such as native language [33,51], gender or age [16,27]. These techniques have been deployed in real world scenarios: native language identification was used as part of the effort to identify the anonymous perpetrators of the 2014 Sony hack [17], and it is believed that the US NSA used author attribution techniques to uncover the identity of the real humans behind the fictitious persona of Bitcoin "creator" Satoshi Nakamoto.<sup>1</sup>

Our contribution concentrates on the perspective of the "machine learner" as an adversary that works with the standard "bag-of-words" representation of documents often used in text processing tasks. A *bag-of-words* representation retains only the original document's words and their frequency (thus forgetting the order in which the words occur). Remarkably this representation still contains sufficient information to enable the original authors to be identified (by a stylistic analysis) *as well as* the document's topic to be classified, both with a significant degree of accuracy.<sup>2</sup> Within this context we reframe the PAN@Clef author obfuscation challenge as follows:

Given an input bag-of-words representation of a text document, provide a mechanism which changes the input without disturbing its topic classification, but such that the author can no longer be identified.

In the rest of the paper we use ideas inspired by d<sub>X</sub>-*privacy* [9], a metric-based extension of differential privacy, to implement an automated privacy mechanism which, unlike current ad hoc approaches to author obfuscation, gives access to both solid privacy and utility guarantees.<sup>3</sup>

<sup>1</sup> https://medium.com/cryptomuse/how-the-nsa-caught-satoshi-nakamoto-868affcef595.

<sup>2</sup> This includes, for example, the character n-gram representation used for author identification in [29].

<sup>3</sup> Our notion of utility here is similar to other work aiming at text privacy, such as [32,53].

We implement a mechanism K which takes bag-of-words inputs b, b′ and produces "noisy" bag-of-words outputs determined by K(b), K(b′), with the following properties:

- **Utility:** Possible outputs determined by K(b) are distributed according to a Laplace probability density function scored according to a semantic similarity metric.

In what follows we define *semantic similarity* in terms of the classic *Earth Mover's distance* used in machine learning for topic classification in text document processing.<sup>4</sup> We explain how to combine this with d<sub>X</sub>-privacy, which extends privacy for databases to other unstructured domains (such as texts).

In Sect. 2 we set out the details of the bag-of-words representation of documents and define the Earth Mover's metric for topic classification. In Sect. 3 we define a generic mechanism which satisfies "E<sub>dX</sub>-privacy" relative to the Earth Mover's metric E<sub>dX</sub> and show how to use it for our obfuscation problem. We note that our generic mechanism is of independent interest for other domains where the Earth Mover's metric applies. In Sect. 4 we describe how to implement the mechanism for data represented as real-valued vectors and prove its privacy/utility properties with respect to the Earth Mover's metric; in Sect. 5 we show how this applies to bags-of-words. Finally in Sect. 6 we provide an experimental evaluation of our obfuscation mechanism, and discuss the implications.

Throughout we assume standard definitions of probability spaces [18]. For a set A we write DA for the set of (possibly continuous) probability distributions over A. For η ∈ DA and a (measurable) subset A′ ⊆ A, we write η(A′) for the probability that (w.r.t. η) a randomly selected a is contained in A′. In the special case of singleton sets, we write η{a}. If K: α → Dα is a mechanism, we write K(a)(A) for the probability that, if the input is a, the output will be contained in A.

### **2 Documents, Topic Classification and Earth Moving**

In this section we summarise the elements from machine learning and text processing needed for this paper. Our first definition sets out the representation for documents we shall use throughout. It is a typical representation of text documents used in a variety of classification tasks.

**Definition 1.** *Let* S *be the set of all words (drawn from a finite alphabet). A* document *is defined to be a finite bag over* S*, also called a* bag-of-words*. We denote the set of documents as* <sup>B</sup>S*, i.e. the set of (finite) bags over* <sup>S</sup>*.*
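As a concrete reading of Definition 1, a bag-of-words is simply a multiset of words. A minimal Python sketch (our illustration, not code from the paper):

```python
from collections import Counter

def bag_of_words(text):
    """A document as a finite bag over S: words with their
    multiplicities, forgetting the order in which they occur."""
    return Counter(text.lower().split())

b = bag_of_words("the president greets the press in Chicago")
assert b["the"] == 2          # multiplicities are kept
assert sum(b.values()) == 7   # the size of the bag
```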

<sup>4</sup> In NLP, this distance measure is known as the Word Mover's distance. We use the classic Earth Mover's here for generality.

Once a text is represented as a bag-of-words, depending on the processing task, further representations of the words within the bag are usually required. We shall focus on two important representations: the first is when the task is semantic analysis, e.g. topic classification, and the second is when the task is author identification. We describe the representation for topic classification in this section, and leave the representation for author identification for Sects. 5 and 6.

#### **2.1 Word Embeddings**

Machine learners can be trained to classify the topic of a document, such as "health", "sport", "entertainment"; this notion of topic means that the words within documents will have particular semantic relationships to each other. There are many ways to do this classification, and in this paper we use a technique that has as a key component "word embeddings", which we summarise briefly here.

A *word embedding* is a real-valued vector representation of words where the precise representation has been experimentally determined by a neural network sensitive to the way words are used in sentences [38]. Such embeddings have some interesting properties, but here we only rely on the fact that when the embeddings are compared using a distance determined by a pseudometric<sup>5</sup> on R<sup>n</sup>, words with similar meanings are found to be close together as word embeddings, and words which are significantly different in meaning are far apart as word embeddings.

**Definition 2.** *An* n*-dimensional word embedding is a mapping Vec* : S → R<sup>n</sup>*. Given a pseudometric dist on* R<sup>n</sup> *we define a distance on words distVec* : S × S → R<sup>≥0</sup> *as follows:*

*distVec*(w1, w2) := *dist*(*Vec*(w1), *Vec*(w2)) .

Observe that the property of being a pseudometric on R<sup>n</sup> carries over to S.

**Lemma 1.** *If dist is a pseudometric on* R<sup>n</sup> *then distVec is also a pseudometric on* S*.*

*Proof. Immediate from the definition of a pseudometric: the triangle inequality and the symmetry of distVec are inherited from dist.*
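To make Definition 2 concrete, here is a toy sketch with invented 2-dimensional embeddings (real embeddings are learned; see Sect. 6). The values in `Vec` are hypothetical; `dist` and `distVec` mirror the definitions above:

```python
import math

# Hypothetical 2-d word embeddings, for illustration only.
Vec = {
    "president": (0.9, 0.1),
    "chief":     (0.8, 0.2),
    "chef":      (0.1, 0.9),
    "cook":      (0.1, 0.9),   # distinct word mapped to the same vector
}

def dist(u, v):
    """Euclidean distance on R^n (a metric, hence a pseudometric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def distVec(w1, w2):
    return dist(Vec[w1], Vec[w2])

# Pseudometric: distance 0 need not imply equality of words.
assert distVec("chef", "cook") == 0.0 and "chef" != "cook"
assert distVec("president", "chief") < distVec("president", "chef")
```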

Word embeddings are particularly suited to language analysis tasks, including topic classification, due to their useful semantic properties. Their effectiveness depends on the quality of the embedding *Vec*, which can vary depending on the size and quality of the training data. We provide more details of the particular

<sup>5</sup> Recall that a pseudometric satisfies both the triangle inequality and symmetry; but different words could be mapped to the same vector and so *distVec*(w1, w2) = 0 no longer implies that w<sup>1</sup> = w2.

embeddings in Sect. 6. Topic classifiers can also differ on the choice of underlying metric *dist*, and we discuss variations in Sect. 3.2.

In addition, once the word embedding *Vec* has been determined, and the distance *dist* has been selected for comparing "word meanings", there are a variety of semantic similarity measures that can be used to compare documents, for us bags-of-words. In this work we use the "Word Mover's Distance", which was shown to perform well across multiple text classification tasks [31].

The *Word Mover's Distance* is based on the classic *Earth Mover's Distance* [43] used in transportation problems with a given distance measure. We shall use the more general Earth Mover's definition with *dist*<sup>6</sup> as the underlying distance measure between words. We note that our results can be applied to problems outside of the text processing domain.

Let X, Y ∈ BS; we denote by X the tuple $\langle x\_1^{a\_1}, x\_2^{a\_2}, \ldots, x\_k^{a\_k} \rangle$, where $a\_i$ is the number of times that $x\_i$ occurs in X. Similarly we write $Y = \langle y\_1^{b\_1}, y\_2^{b\_2}, \ldots, y\_l^{b\_l} \rangle$; we have $\sum\_i a\_i = |X|$ and $\sum\_j b\_j = |Y|$, the sizes of X and Y respectively. We define a *flow matrix* $F \in \mathbb{R}^{k \times l}\_{\ge 0}$, where $F\_{ij}$ represents the (non-negative) amount of flow from $x\_i \in X$ to $y\_j \in Y$.

**Definition 3** *(Earth Mover's Distance)***.** *Let* $d\_{\mathcal{S}}$ *be a (pseudo)metric over* S*. The Earth Mover's Distance with respect to* $d\_{\mathcal{S}}$*, denoted by* $E\_{d\_{\mathcal{S}}}$*, is the solution to the following linear optimisation:*

$$E\_{d\_{\mathcal{S}}}(X, Y) := \min \sum\_{x\_i \in X} \sum\_{y\_j \in Y} d\_{\mathcal{S}}(x\_i, y\_j) F\_{ij} \quad , \quad \text{subject to} \tag{1}$$

$$\sum\_{i=1}^{k} F\_{ij} = \frac{b\_j}{|Y|} \quad \text{and} \quad \sum\_{j=1}^{l} F\_{ij} = \frac{a\_i}{|X|} \quad , \quad F\_{ij} \ge 0, \quad 1 \le i \le k,\ 1 \le j \le l \tag{2}$$

*where the minimum in (1) is over all possible flow matrices* F *subject to the constraints (2). In the special case that* |X| = |Y|*, the solution is known to satisfy the conditions of a (pseudo)metric [43], which we call the* Earth Mover's Metric*.*

In this paper we are interested in the special case |X| = |Y|, hence we use the term *Earth Mover's metric* to refer to $E\_{d\_{\mathcal{S}}}$.
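For equal-size bags with uniform weights, the optimisation (1)–(2) is attained at a one-to-one matching of elements, so for tiny bags it can be computed by brute force over permutations. The following stdlib-only sketch is our illustration (practical implementations solve the linear program instead):

```python
import itertools

def emd_equal_size(X, Y, d):
    """Earth Mover's metric between equal-size bags (given as lists,
    repeating elements by multiplicity): with uniform weights 1/|X|,
    an optimal flow is a one-to-one matching, so we minimise the
    average matched distance over all permutations of Y."""
    assert len(X) == len(Y)
    n = len(X)
    return min(
        sum(d(x, y) for x, y in zip(X, perm)) / n
        for perm in itertools.permutations(Y)
    )

# Toy distance on integers standing in for a word distance d_S.
d = lambda a, b: abs(a - b)
assert emd_equal_size([0, 10], [10, 0], d) == 0.0   # same bag, reordered
assert emd_equal_size([0, 10], [1, 11], d) == 1.0
```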

We end this section by describing how texts are prepared for machine learning tasks, and how Definition 3 is used to distinguish documents. Consider the text snippet "The President greets the press in Chicago". The first thing is to remove all "stopwords" – these are words which do not contribute to semantics, and include things like prepositions, pronouns and articles. The words remaining are those that contain a great deal of semantic and stylistic traits.<sup>7</sup>

<sup>6</sup> In our experiments we take *dist* to be defined by the Euclidean distance.

<sup>7</sup> In fact the way that stopwords are used in texts turns out to be a characteristic feature of authorship. Here we follow standard practice in natural language processing and remove them for efficiency purposes, studying the privacy of what remains. All of our results apply equally well had we left stopwords in place.

**Fig. 1.** Earth Mover's metric between sample documents.

In this case we obtain the bag:

$$b\_1 \quad := \quad \langle \text{President}^1, \text{greets}^1, \text{press}^1, \text{Chicago}^1 \rangle \; .$$

Consider a second bag: $b\_2 := \langle \text{Chief}^1, \text{speaks}^1, \text{media}^1, \text{Illinois}^1 \rangle$, corresponding to a different text. Figure 1 illustrates the optimal flow matrix which solves the optimisation problem in Definition 3 relative to $d\_{\mathcal{S}}$. Here each word is mapped completely to another word, so that $F\_{ij} = 1/4$ when i = j and 0 otherwise. We show later that this is always the case between bags of the same size. With these choices we can compute the distance between $b\_1, b\_2$:

$$\begin{split} E\_{d\_{\mathcal{S}}}(b\_1, b\_2) &= \frac{1}{4} \big( d\_{\mathcal{S}}(\text{President}, \text{Chief}) + d\_{\mathcal{S}}(\text{greets}, \text{speaks}) + \\ &\qquad d\_{\mathcal{S}}(\text{press}, \text{media}) + d\_{\mathcal{S}}(\text{Chicago}, \text{Illinois}) \big) \\ &= 2.816 \; . \end{split} \tag{3}$$

For comparison, consider the distances from $b\_1$ and $b\_2$ to a third document, $b\_3 := \langle \text{Chef}^1, \text{breaks}^1, \text{cooking}^1, \text{record}^1 \rangle$. Using the same word embedding metric,<sup>8</sup> we find that $E\_{d\_{\mathcal{S}}}(b\_1, b\_3) = 4.121$ and $E\_{d\_{\mathcal{S}}}(b\_2, b\_3) = 3.941$. Thus $b\_1, b\_2$ would be classified as semantically "closer" to each other than to $b\_3$, in line with our own (linguistic) interpretation of the original texts.

### **3 Differential Privacy and the Earth Mover's Metric**

Differential Privacy was originally defined with the protection of individuals' data in mind. The intuition is that privacy is achieved through "plausible deniability", i.e. whatever output is obtained from a query, it could have just as

<sup>8</sup> We use the same word2vec-based metric as in our experiments; this is described in Sect. 6.

easily have arisen from a database that does not contain an individual's details, as from one that does. In particular, there should be no easy way to distinguish between the two possibilities. Privacy in text processing means something a little different. A "query" corresponds to releasing the topic-related contents of the document (in our case the bag-of-words)—this relates to the utility because we would like to reveal the semantic content. The privacy relates to investing *individual documents* with plausible deniability, rather than *individual authors* directly. What this means for privacy is the following. Suppose we are given two documents b1, b<sup>2</sup> written by two distinct authors A1, A2, and suppose further that b1, b<sup>2</sup> are changed through a privacy mechanism so that it is difficult or impossible to distinguish between them (by any means). Then it is also difficult or impossible to determine whether the authors of the original documents are A<sup>1</sup> or A2, or some other author entirely. This is our aim for obfuscating authorship whilst preserving semantic content.

Our approach to obfuscating documents replaces words with other words, governed by probability distributions over possible replacements. Thus the type of our mechanism is BS → D(BS), where (recall) D(BS) is the set of probability distributions over the set of (finite) bags of S. Since we are aiming to find a careful trade-off between utility and privacy, our objective is to ensure that there is a high probability of outputting a document on a similar topic to the input document. As explained in Sect. 2, topic similarity of documents is determined by the Earth Mover's distance relative to a given (pseudo)metric on word embeddings, and so our privacy definition must also be relative to the Earth Mover's distance.

**Definition 4** *(Earth Mover's Privacy)***.** *Let* X *be a set, let* d<sub>X</sub> *be a (pseudo)metric on* X*, and let* E<sub>dX</sub> *be the Earth Mover's metric on* BX *relative to* d<sub>X</sub>*. Given* ε ≥ 0*, a mechanism* K : BX → D(BX) *satisfies* εE<sub>dX</sub>*-privacy iff for any* b, b' ∈ BX *and* Z ⊆ BX*:*

$$K(b)(Z) \quad \le \quad e^{\epsilon E\_{d\_X}(b,b')} K(b')(Z) \,. \tag{4}$$

Definition 4 tells us that when two documents are measured to be very close, so that εE<sub>dX</sub>(b, b') is close to 0, the multiplier e<sup>εE<sub>dX</sub>(b,b')</sup> is approximately 1 and the outputs K(b) and K(b') are almost identical. On the other hand, the more the input bags can be distinguished by E<sub>dX</sub>, the more their outputs are likely to differ. This flexibility is what allows us to strike a balance between utility and privacy; we discuss this issue further in Sect. 5 below.

Our next task is to show how to implement a mechanism that can be proved to satisfy Definition 4. We follow the basic construction of Dwork et al. [12] for lifting a differentially private mechanism K : X → D(X) to a differentially private mechanism <u>K</u><sup>∗</sup> : X<sup>N</sup> → D(X<sup>N</sup>) on *vectors* in X<sup>N</sup>. (Note that, unlike a bag, a vector imposes a fixed order on its components.) Here the idea is to apply K independently to each component of a vector v ∈ X<sup>N</sup> to produce a random output vector, also in X<sup>N</sup>. In particular the probability of outputting some vector v' is the product:

$$\underline{K}^\star(v)\{v'\}\quad = \prod\_{1 \le i \le N} K(v\_i)\{v'\_i\}\,. \tag{5}$$

Thanks to the compositional properties of differential privacy, when the underlying metric on X satisfies the triangle inequality it is possible to show that the resulting mechanism <u>K</u><sup>∗</sup> satisfies the following privacy property [13]:

$$
\underline{K}^\star(v)(Z) \quad \le \quad e^{\epsilon M\_{d\_{\mathcal{X}}}(v,v')} \underline{K}^\star(v')(Z) \; , \tag{6}
$$

where M<sub>dX</sub>(v, v') := ∑<sub>1≤i≤N</sub> d<sub>X</sub>(v<sub>i</sub>, v'<sub>i</sub>), the Manhattan metric relative to d<sub>X</sub>.

However Definition 4 does not follow from (6), since Definition 4 operates on bags of size N, and the Manhattan distance between vector representations of two bags is no smaller than N × E<sub>dX</sub>. Remarkably, however, it turns out that K<sup>∗</sup> (the mechanism that applies K independently to each item in a given bag) in fact satisfies the much stronger Definition 4, provided the input bags have the same size as each other, as the following theorem shows.

**Theorem 1.** *Let* d<sub>X</sub> *be a pseudo-metric on* X *and let* K : X → D(X) *be a mechanism satisfying* εd<sub>X</sub>*-privacy, i.e.*

$$K(x)(Z) \quad \le \quad e^{\epsilon d\_{\mathcal{X}}(x, x')} K(x')(Z) \; , \; for \; all \; x, x' \in \mathcal{X}, \; Z \subseteq \mathcal{X}. \tag{7}$$

*Let* K<sup>∗</sup> : BX → D(BX) *be the mechanism obtained by applying* K *independently to each element of* X*, for any* X ∈ BX*. Denote by* K<sup>∗</sup>↓N *the restriction of* K<sup>∗</sup> *to bags of fixed size* N*. Then* K<sup>∗</sup>↓N *satisfies* εNE<sub>dX</sub>*-privacy.*

*Proof (Sketch). The full proof is given in our complete paper [15]; here we sketch the main ideas.*

*Let* b, b' *be input bags, both of size* N*, and let* c *be a possible output bag (of* K<sup>∗</sup>*). Observe that the output bags of* K<sup>∗</sup>(b), K<sup>∗</sup>(b') *and the bag* c *also have size* N*. We shall show that (4) is satisfied for the singleton set* {c} *with multiplier* εN*, from which it follows that (4) is satisfied for all sets* Z*.*

*By the Birkhoff-von Neumann theorem [26], in the case where all bags have the same size, the minimisation problem in Definition 3 is optimised by a transportation matrix* F *in which all values* F<sub>ij</sub> *are either* 0 *or* 1/N*. This implies that the optimal transportation for* E<sub>dX</sub>(b, c) *is achieved by moving each word in the bag* b *to a (single) word in the bag* c*. The same is true for* E<sub>dX</sub>(b', c) *and* E<sub>dX</sub>(b, b')*. Next we use a vector representation of bags as follows: for bag* b*, we write* <u>b</u> *for a vector in* X<sup>N</sup> *such that each element of* b *appears at some component* <u>b</u><sub>i</sub>*.*

*Next we fix* <u>b</u> *and* <u>b</u>' *to be vector representations of* b *and* b' *respectively in* X<sup>N</sup> *such that the optimal transportation for* E<sub>dX</sub>(b, b') *satisfies*

$$E\_{d\_X}(b, b') \quad = \quad 1/N \times \sum\_{1 \le i \le N} d\_{\mathcal{X}}(\underline{b}\_i, \underline{b}'\_i) \quad = \quad M\_{d\_X}(\underline{b}, \underline{b}')/N \,. \tag{8}$$

*The final fact we need is a relationship between* K<sup>∗</sup>*, acting on bags of size* N*, and* <u>K</u><sup>∗</sup>*, which acts on vectors in* X<sup>N</sup> *by applying* K *independently to each component of a vector; it is characterised as follows. Let* b, c *be bags and let* <u>b</u>, <u>c</u> *be any vector representations. For a permutation* σ : {1 ... N} → {1 ... N} *write* <u>c</u><sup>σ</sup> *for the vector with components permuted by* σ*, so that* <u>c</u><sup>σ</sup><sub>i</sub> = <u>c</u><sub>σ(i)</sub>*. With these definitions, the following equality between probabilities holds:*

$$K^\star(b)\{c\}\quad = \sum\_{\sigma} \underline{K}^\star(\underline{b})\{\underline{c}^\sigma\}\,,\tag{9}$$

*where the summation is over all permutations that give distinct vector representations of* c*. We now compute directly:*

$$\begin{array}{lll}
 & K^{\star}(b)\{c\} & \\
= & \sum\_{\sigma} \underline{K}^{\star}(\underline{b})\{\underline{c}^{\sigma}\} & \text{``(9) for } b, c \text{''} \\
\le & \sum\_{\sigma} e^{\epsilon M\_{d\_{\mathcal{X}}}(\underline{b},\underline{b}')} \underline{K}^{\star}(\underline{b}')\{\underline{c}^{\sigma}\} & \text{``(6)''} \\
= & e^{\epsilon N E\_{d\_{\mathcal{X}}}(b,b')} \sum\_{\sigma} \underline{K}^{\star}(\underline{b}')\{\underline{c}^{\sigma}\} & \text{``(8)''} \\
= & e^{\epsilon N E\_{d\_{\mathcal{X}}}(b,b')} K^{\star}(b')\{c\} & \text{``(9) for } b', c \text{''}
\end{array}$$

*as required.*

#### **3.1 Application to Text Documents**

Recall the bag-of-words

$$b\_2 \quad := \quad \langle \text{Chief}^1, \text{speaks}^1, \text{media}^1, \text{Illinois}^1 \rangle \;,$$

and assume we are provided with a mechanism K satisfying the standard εd<sub>X</sub>-privacy property (7) for individual words. As in Theorem 1 we can create a mechanism K<sup>∗</sup> by applying K independently to each word in the bag, so that, for example, the probability of outputting b<sub>3</sub> = ⟨Chef¹, breaks¹, cooking¹, record¹⟩ is determined by (9):

$$K^\star(b\_2)(\{b\_3\}) \quad = \sum\_{\sigma} \prod\_{1 \le i \le 4} K(\underline{b}\_{2i}) \{\underline{b}\_{3i}^{\sigma}\} \; .$$

By Theorem 1, K<sup>∗</sup> satisfies 4εE<sub>dS</sub>-privacy. Recalling from (3) that E<sub>dS</sub>(b<sub>1</sub>, b<sub>2</sub>) = 2.816, we deduce that if ε ∼ 1/16 then the output distributions K<sup>∗</sup>(b<sub>1</sub>) and K<sup>∗</sup>(b<sub>2</sub>) differ by at most the multiplier e<sup>2.816×4/16</sup> ∼ 2.02; but if ε ∼ 1/32 those distributions differ by a multiplier of only 1.42. In the latter case the outputs of K<sup>∗</sup> on b<sub>1</sub> and b<sub>2</sub> are almost indistinguishable.
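These multipliers follow directly from Theorem 1's bound e<sup>εNE</sup> and can be checked in a couple of lines:

```python
import math

# Indistinguishability multiplier e^{eps * N * E} from Theorem 1,
# for the worked example: E_dS(b1, b2) = 2.816 and bags of size N = 4.
E, N = 2.816, 4

for eps in (1 / 16, 1 / 32):
    print(f"eps = {eps}: multiplier = {math.exp(eps * N * E):.2f}")
```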

The parameter ε depends on the randomness implemented in the basic mechanism K; we investigate that further in Sect. 4.

#### **3.2 Properties of Earth Mover's Privacy**

In machine learning a number of "distance measures" are used in classification and clustering tasks, and in this section we explore some properties of privacy when we vary the underlying metric of the Earth Mover's metric used to classify complex objects.

Let v, v' ∈ R<sup>n</sup> be real-valued n-dimensional vectors. We use the following (well-known) metrics: the *Euclidean* distance ‖v − v'‖ := √(∑<sub>1≤i≤n</sub> (v<sub>i</sub> − v'<sub>i</sub>)²) and the *Manhattan* distance ‖v − v'‖₁ := ∑<sub>1≤i≤n</sub> |v<sub>i</sub> − v'<sub>i</sub>|. Recall that in our applications we have looked at bags-of-words, where the words themselves are represented as n-dimensional vectors.<sup>9</sup>

Note that the Euclidean and Manhattan distances determine pseudometrics on words as defined at Definition 2 and proved at Lemma 1.

**Lemma 2.** *If* d<sub>X</sub> ≤ d'<sub>X</sub> *(point-wise), then* E<sub>dX</sub> ≤ E<sub>d'X</sub> *(point-wise).*

*Proof.* By contradiction. Suppose E<sub>dX</sub>(b, c) > E<sub>d'X</sub>(b, c) for some bags b, c, and let F<sub>ij</sub>, F'<sub>ij</sub> be the minimal flow matrices for E<sub>dX</sub>, E<sub>d'X</sub> respectively. Since d<sub>X</sub> ≤ d'<sub>X</sub>, the flow F'<sub>ij</sub> yields a solution for E<sub>dX</sub> of cost at most E<sub>d'X</sub>(b, c), which is strictly smaller, contradicting the minimality of F<sub>ij</sub>.

**Corollary 1.** *If* d<sub>X</sub> ≤ d'<sub>X</sub> *(point-wise), then* εE<sub>dX</sub>*-privacy implies* εE<sub>d'X</sub>*-privacy.*

This shows that, for example, εE<sub>‖·‖</sub>-privacy implies εE<sub>‖·‖₁</sub>-privacy; and indeed for any distance measure d which exceeds the Euclidean distance, E<sub>‖·‖</sub>-privacy implies E<sub>d</sub>-privacy.
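This instance of the corollary rests on the point-wise fact that the Euclidean distance never exceeds the Manhattan distance; a quick numerical spot-check on invented random vectors:

```python
import math
import random

def euclidean(v, w):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, w)))

def manhattan(v, w):
    return sum(abs(a - b) for a, b in zip(v, w))

# Spot-check the point-wise inequality behind Corollary 1:
# Euclidean distance <= Manhattan distance, so eps*E_Euclidean-privacy
# implies eps*E_Manhattan-privacy.
random.seed(0)
pairs = [([random.uniform(-1, 1) for _ in range(5)],
          [random.uniform(-1, 1) for _ in range(5)]) for _ in range(1000)]
assert all(euclidean(v, w) <= manhattan(v, w) + 1e-12 for v, w in pairs)
```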

We end this section by noting that Definition 4 satisfies *post-processing*; i.e. privacy does not decrease under post-processing. We write K; K' for the composition of mechanisms K, K' : BX → D(BX), defined:

$$(K; K')(b)(Z) \quad := \sum\_{b' \in \mathbb{B}\mathcal{X}} K(b)(\{b'\}) \times K'(b')(Z) \,. \tag{10}$$

**Lemma 3** *[Post-processing]***.** *If* K, K' : BX → D(BX) *and* K *is* εE<sub>dX</sub>*-private for (pseudo)metric* d<sub>X</sub> *on* X*, then* K; K' *is* εE<sub>dX</sub>*-private.*

**Fig. 2.** Laplace density function *Lap*<sup>2</sup><sub>ε</sub> in R<sup>2</sup> (3D plot and contour diagram)

<sup>9</sup> As we shall see, in the machine learning analysis *documents* are represented as bags of n-dimensional vectors (word embeddings), where each bag contains N such vectors.

### **4 Earth Mover's Privacy for Bags of Vectors in** R*<sup>n</sup>*

In Theorem 1 we have shown how to promote a privacy mechanism on components to E<sub>dX</sub>-privacy on a bag of those components. In this section we show how to implement a privacy mechanism satisfying (7) when the components are represented by high-dimensional vectors in R<sup>n</sup> and the underlying metric is the Euclidean distance on R<sup>n</sup>, which we denote by ‖·‖.

We begin by summarising the basic probabilistic tools we need. A *probability density function* (PDF) over some domain <sup>D</sup> is a function <sup>φ</sup> : D → [0, 1] whose value φ(z) gives the "relative likelihood" of z. The probability density function is used to compute the probability of an outcome "<sup>z</sup> <sup>∈</sup> <sup>A</sup>", for some region <sup>A</sup> ⊆ D as follows:

$$
\int\_A \phi(x) \, dx \,. \tag{11}
$$

In differential privacy, a popular density function used for implementing mechanisms is the *Laplacian*, defined next.

**Definition 5.** *Let* n ≥ 1 *be an integer,* ε > 0 *a real, and* v ∈ R<sup>n</sup>*. We define the Laplacian probability density function in* n *dimensions:*

$$\mathit{Lap}\_{\epsilon}^{n}(v) \quad := \quad c\_{n}^{\epsilon} \times e^{-\epsilon \|v\|}$$

*where* ‖v‖ = √(v<sub>1</sub>² + ··· + v<sub>n</sub>²)*, and* c<sup>ε</sup><sub>n</sub> *is a real-valued constant satisfying the integral equation* 1 = ∫ ··· ∫<sub>R<sup>n</sup></sub> Lap<sup>n</sup><sub>ε</sub>(v) dv<sub>1</sub> . . . dv<sub>n</sub>*.*

When n = 1, we can compute c<sup>ε</sup><sub>1</sub> = ε/2, and when n = 2, we have that c<sup>ε</sup><sub>2</sub> = ε²/2π.
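The one-dimensional constant can be sanity-checked numerically; the sketch below uses simple trapezoidal integration (step size and truncation bounds are our choices; the tails beyond ±40 are of order e<sup>−40</sup> and so negligible).

```python
import math

def lap1(x, eps):
    """One-dimensional Laplacian density (eps/2) * e^{-eps*|x|}."""
    return (eps / 2) * math.exp(-eps * abs(x))

# Trapezoidal rule over [-40, 40]: the result should be very close to 1.
eps, lo, hi, steps = 1.0, -40.0, 40.0, 80_000
h = (hi - lo) / steps
total = h * sum(lap1(lo + i * h, eps) for i in range(steps + 1))
total -= h * (lap1(lo, eps) + lap1(hi, eps)) / 2  # trapezoid end correction
print(total)
```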

In privacy mechanisms, probability density functions are used to produce a "noisy" version of the released data. The benefit of the Laplace distribution is that, besides creating randomness, the likelihood that the released value differs from the true value decreases exponentially with the distance between them. This means that the utility of the data release is high, whilst at the same time its actual value is masked. The probability density function *Lap*<sup>2</sup><sub>ε</sub>(v) in Fig. 2 depicts this situation: the highest relative likelihood is that a randomly selected point on the plane lies close to the origin, with the chance of choosing more distant points diminishing rapidly. Once we are able to select a vector v' in R<sup>n</sup> according to *Lap*<sup>n</sup><sub>ε</sub>, we can "add noise" to any given vector v as v + v', so that the true value v is highly likely to be perturbed only a small amount.

In order to use the Laplacian in Definition 5, we need to implement it. Andrés et al. [4] exhibited a mechanism for *Lap*<sup>2</sup><sub>ε</sub>(v), and here we show how to extend that idea to the general case. The main idea of the construction for *Lap*<sup>2</sup><sub>ε</sub>(v) uses the fact that any vector on the plane can be represented by polar coordinates (r, θ), so that selecting a vector at distance no more than r from the origin can be achieved by selecting r and θ independently. In order to obtain a distribution which overall is equivalent to *Lap*<sup>2</sup><sub>ε</sub>(v), Andrés et al. computed that r must be selected according to a distribution expressed using the well-known "Lambert W" function, and θ uniformly over the unit circle. In our generalisation to *Lap*<sup>n</sup><sub>ε</sub>(v), we observe that the same idea is valid [6]. Observe first that every vector in R<sup>n</sup> can be expressed as a pair (r, p), where r is the distance from the origin and p is a point on B<sup>n</sup>, the unit *hypersphere* in R<sup>n</sup>. Selecting vectors according to *Lap*<sup>n</sup><sub>ε</sub>(v) can then be achieved by independently selecting r and p, but this time r must be selected according to the *Gamma distribution*, and p uniformly over B<sup>n</sup>. We set out the details next.

**Definition 6.** *The* Gamma distribution *of (integer) shape* n *and scale* δ > 0 *is determined by the probability density function:*

$$\mathit{Gam}\_{\delta}^{n}(r) \quad := \quad \frac{r^{n-1} e^{-r/\delta}}{\delta^{n}(n-1)!} \;. \tag{12}$$

**Definition 7.** *The uniform distribution over the surface of the unit hypersphere* B<sup>n</sup> *is determined by the probability density function:*

$$\mathit{Uniform}^n(v) \quad := \quad \frac{\Gamma(\frac{n}{2})}{2\pi^{n/2}} \text{ if } \ v \in B^n \quad else \quad 0 \,, \tag{13}$$

*where* B<sup>n</sup> := {v ∈ R<sup>n</sup> | ‖v‖ = 1}*, and* Γ(α) := ∫<sub>0</sub><sup>∞</sup> x<sup>α−1</sup>e<sup>−x</sup> dx *is the "Gamma function".*

With Definitions 6 and 7 we are able to provide an implementation of a mechanism which produces noisy vectors around a given vector in R<sup>n</sup> according to the Laplacian distribution in Definition 5. The first task is to show that our decomposition of *Lap*<sup>n</sup> is correct.

**Lemma 4.** *The* n*-dimensional Laplacian* Lap<sup>n</sup><sub>ε</sub>(v) *can be realised by selecting vectors represented as* (r, p)*, where* r *is selected according to* Gam<sup>n</sup><sub>1/ε</sub>(r) *and* p *is selected independently according to* Uniform<sup>n</sup>(p)*.*

*Proof* (Sketch). The proof follows by changing variables to spherical coordinates and then showing that ∫<sub>A</sub> Lap<sup>n</sup><sub>ε</sub>(v) dv can be expressed as the product of independent selections of r and p*.*

We use a spherical-coordinate representation of v as:

$$\begin{array}{l} r := \|v\| \ , \ \text{and} \\ v\_1 := r\cos\theta\_1 \ , \quad v\_2 := r\sin\theta\_1\cos\theta\_2 \ , \ \dots \ , \quad v\_n := r\sin\theta\_1\sin\theta\_2 \dots \sin\theta\_{n-2}\sin\theta\_{n-1} \ . \end{array}$$

Next we assume for simplicity that A is a hypersphere of radius R; with that we can reason:

$$\begin{array}{lll} & \int\_{A} \mathit{Lap}\_{\epsilon}^{n}(v) \, dv & \\ = & \int\_{\|v\| \le R} c\_{n}^{\epsilon} \times e^{-\epsilon \|v\|} \, dv & \text{``Definition 5; } A \text{ is a hypersphere of radius } R \text{''} \\ = & \int\_{r \le R} \int\_{A\_{\theta}} c\_{n}^{\epsilon} \times e^{-\epsilon r} \, \frac{\partial(v\_1, \dots, v\_n)}{\partial(r, \theta\_1, \dots, \theta\_{n-1})} \, dr \, d\theta\_1 \dots d\theta\_{n-1} & \text{``Change of variables to spherical coordinates; see (14)''} \\ = & \int\_{r \le R} \int\_{A\_{\theta}} c\_{n}^{\epsilon} \times e^{-\epsilon r} \, r^{n-1} \sin^{n-2}\theta\_1 \sin^{n-3}\theta\_2 \dots \sin^{2}\theta\_{n-3} \sin\theta\_{n-2} \, dr \, d\theta\_1 \dots d\theta\_{n-1} \, . \end{array}$$

Now rearranging, we can see that this becomes a product of two integrals. The first, ∫<sub>r≤R</sub> e<sup>−εr</sup>r<sup>n−1</sup> dr, is over the radius and is proportional to the integral of the Gamma distribution of Definition 6; the second is an integral over the angular coordinates, which is proportional to the surface of the unit hypersphere and corresponds to the PDF of Definition 7. Finally, for the change of variables we used the "Jacobian":

$$\frac{\partial(v\_1, v\_2, \dots, v\_n)}{\partial(r, \theta\_1, \dots, \theta\_{n-1})} = r^{n-1} \sin^{n-2} \theta\_1 \sin^{n-3} \theta\_2 \dots \sin\theta\_{n-2} \tag{14}$$

(For full details, see our complete paper [15].)

We can now assemble the facts to demonstrate the n-Dimensional Laplacian.

**Theorem 2 (n-Dimensional Laplacian).** *Given* ε > 0 *and* n ∈ Z<sup>+</sup>*, let* K : R<sup>n</sup> → D(R<sup>n</sup>) *be a mechanism that, given a vector* x ∈ R<sup>n</sup>*, outputs a noisy value as follows:*

$$x \overset{K}{\longmapsto} x + x'$$

*where* x' *is represented as* (r, p)*, with* r ≥ 0 *distributed according to* Gam<sup>n</sup><sub>1/ε</sub>(r) *and* p ∈ B<sup>n</sup> *distributed according to* Uniform<sup>n</sup>(p)*. Then* K *satisfies (7) from Theorem 1, i.e.* K *satisfies* ε‖·‖*-privacy, where* ‖·‖ *is the Euclidean metric on* R<sup>n</sup>*.*

*Proof* (Sketch). Let z, y ∈ R<sup>n</sup>*.* We need to show that, for any (measurable) set A ⊆ R<sup>n</sup>:

$$K(z)(A)/K(y)(A) \quad \le \quad e^{\epsilon \|z - y\|} \text{ .} \tag{15}$$

However (15) follows provided that the probability densities of K(z) and K(y) respectively satisfy it. By Lemma 4 the probability density of K(z)*,* as a function of x, is Lap<sup>n</sup><sub>ε</sub>(z−x); and similarly for the probability density of K(y). Hence we reason:

$$\begin{array}{lll} & \mathit{Lap}\_{\epsilon}^{n}(z-x)/\mathit{Lap}\_{\epsilon}^{n}(y-x) & \\ = & \left( c\_{n}^{\epsilon} \times e^{-\epsilon \|z-x\|} \right) / \left( c\_{n}^{\epsilon} \times e^{-\epsilon \|y-x\|} \right) & \text{``Definition 5''} \\ = & e^{-\epsilon \|z-x\|} \times e^{\epsilon \|y-x\|} & \text{``Arithmetic''} \\ \le & e^{\epsilon \|z-y\|} \,, & \text{``Triangle inequality; } s \mapsto e^{s} \text{ is monotone''} \end{array}$$

as required.

Theorem 2 reduces the problem of adding Laplace noise to vectors in R<sup>n</sup> to selecting a real value according to the Gamma distribution together with an independent uniform selection of a unit vector. Several methods have been proposed for generating random variables according to the Gamma distribution [30]. For the uniform selection of vectors on the unit n-sphere, the method described in [35] avoids the transformation to spherical coordinates: select n random variables independently from the standard normal distribution to produce a vector v ∈ R<sup>n</sup>, and then normalise to output v/‖v‖.

### **4.1 Earth Mover's Privacy in** BR*<sup>n</sup>*

Using the n-dimensional Laplacian, we can now implement an algorithm for εNE<sub>‖·‖</sub>-privacy. Algorithm 1 takes a bag of n-dimensional vectors as input and applies the n-dimensional Laplacian mechanism described in Theorem 2 to each vector in the bag, producing a noisy bag of n-dimensional vectors as output. Corollary 2 summarises the privacy guarantee.

#### **Algorithm 1.** Earth Mover's Privacy Mechanism

```
Require: vector v, dimension n, epsilon ε
1: procedure GenerateNoisyVector(v, n, ε)
2:     r ← Gamma(n, 1/ε)
3:     u ← U(n)
4:     return v + ru
5: end procedure

Require: bag X, dimension n, epsilon ε
1: procedure GeneratePrivateBag(X, n, ε)
2:     Z ← ()
3:     for all x ∈ X do
4:         z ← GenerateNoisyVector(x, n, ε)
5:         add z to Z
6:     end for
7:     return Z
8: end procedure
```
**Corollary 2.** *Algorithm 1 satisfies* εNE<sub>‖·‖</sub>*-privacy, relative to any two bags in* BR<sup>n</sup> *of size* N*.*

*Proof. Follows from Theorems 1 and 2.*
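A direct Python transcription of Algorithm 1 (a sketch, not the authors' code), drawing the radius with the standard library's `random.gammavariate` (shape n, scale 1/ε, matching Gam<sup>n</sup><sub>1/ε</sub> of Definition 6) and the unit direction by normalising Gaussians, as in [35]:

```python
import math
import random

def generate_noisy_vector(v, n, eps, rng=random):
    """Lines 2-4 of Algorithm 1: r ~ Gamma(n, 1/eps); direction u chosen
    uniformly on the unit hypersphere by normalising n standard Gaussians."""
    r = rng.gammavariate(n, 1.0 / eps)
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [vi + r * x / norm for vi, x in zip(v, g)]

def generate_private_bag(X, n, eps):
    """Apply the noisy-vector mechanism independently to each vector."""
    return [generate_noisy_vector(list(x), n, eps) for x in X]
```

Since the radius has mean n/ε, smaller ε produces proportionally larger perturbations, which is the utility/privacy trade-off discussed above.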

#### **4.2 Utility Bounds**

We prove a lower bound on the utility of this algorithm, which applies to high-dimensional data representations. Given an input element x, we define Z to be the set of outputs within distance Δ > 0 of x. Recall that the distance function is a measure of utility; therefore Z = {z | E<sub>‖·‖</sub>(x, z) ≤ Δ} represents the set of outputs within utility Δ of x. Then we have the following:

**Theorem 3.** *Given an input bag* b *consisting of* N n*-dimensional vectors, the mechanism defined by Algorithm 1 outputs an element from* Z = {z | E<sub>‖·‖</sub>(b, z) ≤ Δ} *with probability at least*

$$1 - e^{-\epsilon N \Delta} e\_{n-1}(\epsilon N \Delta) \ ,$$

*whenever* εNΔ ≤ n/e*. (Recall that* e<sub>k</sub>(α) = ∑<sub>0≤i≤k</sub> α<sup>i</sup>/i!*, the sum of the first* k+1 *terms in the series for* e<sup>α</sup>*.)*

*Proof (Sketch). Let* <u>b</u> ∈ (R<sup>n</sup>)<sup>N</sup> *be a (fixed) vector representation of the bag* b*. For* v ∈ (R<sup>n</sup>)<sup>N</sup>*, let* v<sup>◦</sup> ∈ BR<sup>n</sup> *be the bag comprising the* N *components of* v*. Observe that* NE<sub>‖·‖</sub>(b, v<sup>◦</sup>) ≤ M<sub>‖·‖</sub>(<u>b</u>, v)*, and so*

$$Z\_M = \{ v \mid M\_{\parallel \cdot \parallel}(\underline{b}, v) \le N\Delta \} \quad \subseteq \quad \{ v \mid E\_{\parallel \cdot \parallel}(b, v^{\diamond}) \le \Delta \} = Z\_E \ . \tag{16}$$

*Thus the probability of outputting an element of* Z *is the same as the probability of outputting an element of* Z<sub>E</sub>*, and by (16) that is at least the probability of outputting an element from* Z<sub>M</sub> *by applying a standard* n*-dimensional Laplace mechanism to each of the components of* <u>b</u>*. We can now compute:*

$$\begin{array}{ll} & \text{Probability of outputting an element in } Z\_E \\ \ge & \int \dots \int\_{v \in Z\_M} \prod\_{1 \le i \le N} \mathit{Lap}\_{\epsilon}^{n}(\underline{b}\_i - v\_i) \, dv\_1 \dots dv\_N \\ = & \int \dots \int\_{v \in Z\_M} \prod\_{1 \le i \le N} c\_n^{\epsilon} e^{-\epsilon \|\underline{b}\_i - v\_i\|} \, dv\_1 \dots dv\_N \, . \end{array}$$

*The result follows by completing the multiple integrals and applying some approximations, whilst observing that the variables in the integration are* n*-dimensional vector valued. (The details appear in our complete paper [15].)*

We note that in our application word embeddings are typically mapped to vectors in <sup>R</sup><sup>300</sup>, thus we would use <sup>n</sup> <sup>∼</sup> 300 in Theorem 3.
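The bound of Theorem 3 is straightforward to evaluate numerically. In the following sketch the function names are ours; `e_k` computes the truncated exponential series iteratively, which avoids huge intermediate powers when n is large.

```python
import math

def e_k(k, a):
    """Sum of the first k+1 terms of the series for e^a (Theorem 3's e_k)."""
    term, total = 1.0, 1.0
    for i in range(1, k + 1):
        term *= a / i  # running term a^i / i!
        total += term
    return total

def utility_lower_bound(eps, N, delta, n):
    """Theorem 3's lower bound 1 - e^{-eps*N*delta} * e_{n-1}(eps*N*delta)."""
    a = eps * N * delta
    return 1.0 - math.exp(-a) * e_k(n - 1, a)
```

As expected, the bound grows with Δ: allowing a larger utility loss makes it more likely that the output lands inside Z.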

#### **5 Text Document Privacy**

In this section we bring everything together and present a privacy mechanism for text documents; we explore how it contributes to the author obfuscation task described above. Algorithm 2 describes the complete procedure for taking a document as a bag-of-words and outputting a "noisy" bag-of-words. Depending on the setting of the parameter ε, the output bag is likely to be classified as being on a similar topic to the input.

Algorithm 2 uses a function *Vec* to turn the input document into a bag of word embeddings; next, Algorithm 1 produces a noisy bag of word embeddings; and in a final step the inverse *Vec*<sup>−1</sup> is used to reconstruct an actual bag-of-words as output. In our implementation of Algorithm 2, described below, we compute *Vec*<sup>−1</sup>(z) to be the word w that minimises the Euclidean distance ‖z − *Vec*(w)‖. The next result summarises the privacy guarantee for Algorithm 2.

**Theorem 4.** *Algorithm 2 satisfies* εNE<sub>dS</sub>*-privacy, where* d<sub>S</sub> = *dist* ∘ *Vec. That is to say: given input documents (bags)* b, b'*, both of size* N*, and* c *a possible output bag, define* k := E<sub>‖·‖</sub>(*Vec*(b), *Vec*(b'))*, and let* pr(b, c) *and* pr(b', c) *be the respective probabilities that* c *is output given that the input was* b *or* b'*. Then:*

$$pr(b,c) \quad \le \quad e^{\epsilon Nk} \times pr(b',c) \text{ .}$$

#### **Algorithm 2.** Document privacy mechanism

```
Require: bag-of-words b, dimension n, epsilon ε, word embedding Vec : S → R^n
1: procedure GenerateNoisyBagOfWords(b, n, ε, Vec)
2:     X ← Vec(b)
3:     Z ← GeneratePrivateBag(X, n, ε)
4:     return Vec⁻¹(Z)
5: end procedure
```

Note that *Vec* : BS → BR<sup>n</sup> applies *Vec* to each word in the bag b, and (*Vec*<sup>−1</sup>) : BR<sup>n</sup> → BS reverses this procedure as a post-processing step; this involves determining the word w that minimises the Euclidean distance ‖z − *Vec*(w)‖ for each z in Z.

*Proof. The result follows by appeal to Theorem 2 for privacy on the word embeddings; the step to apply Vec*−<sup>1</sup> *to each vector is a post-processing step which by Lemma 3 preserves the privacy guarantee.*
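To make the pipeline concrete, here is a self-contained sketch of Algorithm 2 over a toy vocabulary. The eight words and their 2-dimensional embeddings are invented for illustration (the implementation described above uses 300-dimensional word2vec embeddings), and the noise step inlines the Theorem 2 sampler.

```python
import math
import random

# Toy embedding Vec : S -> R^2; the values are invented for the sketch.
EMB = {
    "chief": (1.0, 0.0), "chef": (1.2, 0.2),
    "speaks": (0.5, 2.5), "breaks": (0.8, 2.2),
    "media": (3.5, 0.5), "cooking": (3.0, 1.0),
    "illinois": (5.5, 5.0), "record": (5.0, 4.5),
}

def noisy_vector(v, n, eps, rng=random):
    """Theorem 2 sampler: radius ~ Gamma(n, 1/eps), direction uniform."""
    r = rng.gammavariate(n, 1.0 / eps)
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    s = math.sqrt(sum(x * x for x in g))
    return [vi + r * x / s for vi, x in zip(v, g)]

def vec_inverse(z):
    """Vec^{-1}: the vocabulary word whose embedding is nearest to z."""
    return min(EMB, key=lambda w: math.dist(EMB[w], z))

def generate_noisy_bag_of_words(b, n, eps):
    X = [list(EMB[w]) for w in b]              # Vec, word by word
    Z = [noisy_vector(x, n, eps) for x in X]   # Algorithm 1
    return [vec_inverse(z) for z in Z]         # post-processing (Lemma 3)
```

The nearest-neighbour `vec_inverse` is the post-processing step of the proof above, so by Lemma 3 it does not weaken the privacy guarantee.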

Although Theorem 4 utilises ideas from differential privacy, an interesting question to ask is how it contributes to the PAN@Clef author obfuscation task, which recall asked for mechanisms that preserve content but mask features that distinguish authorship. Algorithm 2 does indeed attempt to preserve content (to the extent that the topic can still be determined) but it does not directly "remove stylistic features".<sup>10</sup> So has it, in fact, disguised the author's characteristic style? To answer that question, we review Theorem 4 and interpret what it tells us in relation to author obfuscation.

The theorem implies that it is indeed possible to make the (probabilistic) outputs from two distinct documents b, b' almost indistinguishable by choosing ε to be extremely small in comparison with N × E<sub>‖·‖</sub>(*Vec*(b), *Vec*(b')). However, if E<sub>‖·‖</sub>(*Vec*(b), *Vec*(b')) is very large (meaning that b and b' are on entirely different topics), then ε would need to be so tiny that the noisy output document would be highly unlikely to be on a topic remotely close to either b or b' (recall Lemma 3).

This observation highlights the fact that, in some circumstances, the topic itself is a feature that characterises author identity. (First-hand accounts of breaking the world record for the highest and longest free-fall jump would immediately narrow the field down to the title holder.) This means that *any* obfuscating mechanism would, as for Algorithm 2, only be able to disguise the author's identity if there are several authors who write on similar topics. It is in that spirit that we have made a first step towards a satisfactory obfuscating mechanism: provided that documents are similar in topic (i.e. are close when their embeddings are measured by E<sub>‖·‖</sub>), they can be obfuscated so that the content is unlikely to be disturbed, but the contributing authors cannot easily be determined.

<sup>10</sup> Although, as others have noted [53], the bag-of-words representation already removes many stylistic features. We note that our privacy guarantee does not depend on this side-effect.

We can see the importance of the "indistinguishability" property with respect to the PAN obfuscation task. In stylometry analysis, the representation of words for e.g. author classification is completely different from the word embeddings which we have used for topic classification. State-of-the-art author attribution algorithms represent words as "character n-grams" [28], which have been found to capture stylistic clues such as systematic spelling errors. A *character 3-gram*, for example, represents a given word as the complete list of its substrings of length 3. For example, the character 3-gram representations of "color" and "colour" are:

```
· "color" → |[ "col", "olo", "lor" ]|
· "colour" → |[ "col", "olo", "lou", "our" ]|
```
For author identification, any output from Algorithm 2 would then need to be further transformed to a bag of character n-grams as a post-processing step; by Lemma 3 this additional transformation preserves the privacy properties of Algorithm 2. We explore this experimentally in the next section.
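The 3-gram lists shown above are straightforward to compute; a minimal sketch:

```python
def char_ngrams(word, n=3):
    """All contiguous substrings of length n, in order of occurrence."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]
```

For example, `char_ngrams("colour")` yields the four 3-grams `["col", "olo", "lou", "our"]`.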

### **6 Experimental Results**

*Document Set.* The PAN@Clef tasks and other similar work have used a variety of types of text for author identification and author obfuscation. Our desiderata are to have multiple authors writing on one topic (so as to minimise the ability of an author identification system to use topic-related cues) and to have more than one topic (so that we can evaluate utility in terms of accuracy of topic classification). Further, we would like to use data from a domain where potentially large quantities of text are available, already annotated with author and topic.

Given these considerations, we chose "fan fiction" as our domain. Wikipedia defines *fan fiction* as follows: "Fan fiction . . . is fiction about characters or settings from an original work of fiction, created by fans of that work rather than by its creator." This is also the domain that was used in the PAN@Clef 2018 author attribution challenge,<sup>11</sup> although for this work we scraped our own dataset. We chose one of the largest fan fiction sites and the two largest "fandoms" there;<sup>12</sup> these fandoms are our topics. We scraped the stories from these fandoms, the largest proportion of which are for use in training our topic classification model. We held out two subsets of size 20 and 50, evenly split between fandoms/topics, for the evaluation of our privacy mechanism.<sup>13</sup> We follow the evaluation framework of [28]: for each author we construct a known-author text and an unknown-author snippet that we have to match to an author on

<sup>11</sup> https://pan.webis.de/clef18/pan18-web/author-identification.html.

<sup>12</sup> https://www.fanfiction.net/book/, with the two largest fandoms being Harry Potter (797,000 stories) and Twilight (220,000 stories).

<sup>13</sup> Our Algorithm 2 is computationally quite expensive, because each word w = *Vec*<sup>−1</sup>(x) requires the calculation of Euclidean distance with respect to the whole vocabulary. We thus use relatively small evaluation sets, as we apply the algorithm to them for multiple values of ε.

the basis of the known-author texts. (See Appendix in our complete paper [15] for more detail.)

*Word Embeddings.* There are sets of word embeddings trained on large datasets that have been made publicly available. Most of these, however, are already normalised, which makes them unsuitable for our method. We therefore use the Google News word2vec embeddings as the only large-scale unnormalised embeddings available. (See Appendix in our complete paper [15] for more detail.)

*Inference Mechanisms.* We have two sorts of machine learning inference mechanisms: our adversary mechanism for author identification, and our utility-related mechanism for topic classification. For each of these, we can define inference mechanisms both within the same representational space or in a different representational space. As we noted above, in practice both author identification adversary and topic classification will use different representations, but examining same-representation inference mechanisms can give an insight into what is happening within that space.

*Different-Representation Author Identification.* For this we use the algorithm by [28]. This algorithm is widely used: it underpins two of the winners of PAN shared tasks [25,47]; is a common benchmark or starting point for other methods [19,39,44,46]; and is a standard inference attacker for the PAN shared task on authorship obfuscation.<sup>14</sup> It works by representing each text as a vector of space-separated character n-gram counts, and comparing repeatedly sampled subvectors of known-author texts and snippets using cosine similarity. We use as a starting point the code from a reproducibility study [40], but have modified it to improve efficiency. (See Appendix in our complete paper [15] for more details.)
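The core of this style of attribution can be sketched as follows (our own simplification: we show only the n-gram profiling and cosine-similarity steps, not the repeated subvector sampling):

```python
import math
from collections import Counter

def ngram_profile(text, n=3):
    """Character n-gram counts over space-separated tokens."""
    counts = Counter()
    for token in text.split():
        for i in range(len(token) - n + 1):
            counts[token[i:i + n]] += 1
    return counts

def cosine(p, q):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * q[g] for g, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0
```

An unknown-author snippet would then be attributed to whichever known-author text maximises this similarity.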

*Different-Representation Topic Classification.* Here we choose fastText [7,22], a high-performing supervised machine learning classification system. It also works with word embeddings; these differ from word2vec in that they are derived from embeddings over character n-grams, learnt using the same skipgram model as word2vec. This means it can compute representations for words that do not appear in the training data, which is helpful when training with relatively small amounts of data. Also useful in that setting is the ability to start from pretrained embeddings trained on out-of-domain data, which are then adapted to the in-domain (here, fan fiction) data. After training, the accuracy on a validation set we construct from the data is 93.7% (see [15] for details).

*Same-Representation Author Identification.* In the space of our word2vec embeddings, we can define an inference mechanism that for an unknown-author snippet chooses the closest known-author text by Euclidean distance.

<sup>14</sup> http://pan.webis.de/clef17/pan17-web/author-obfuscation.html.

*Same-Representation Topic Classification.* Similarly, we can define an inference mechanism that considers the topic classes of a snippet's neighbours and predicts a class for the snippet based on them. This is essentially the standard k "Nearest Neighbours" technique (k-NN) [21], a non-parametric method that assigns the majority class of the k nearest neighbours. 1-NN corresponds to classification based on a Voronoi tessellation of the space, has low bias and high variance, and asymptotically has an error rate that is never more than twice the Bayes rate; higher values of k have a smoothing effect. Because of the nature of word embeddings, we would not expect this classification to be as accurate as the fastText classification above: in high-dimensional Euclidean space (as here), almost all points are approximately equidistant. Nevertheless, it can give an idea of how a snippet with varying levels of noise added is shifted in Euclidean space with respect to other texts on the same topic. Here, we use k = 5. Same-representation author identification can then be viewed as 1-NN with author as class.
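Both same-representation mechanisms can be sketched with a single function (a simplification with hypothetical argument names; in our experiments the vectors are word2vec document embeddings):

```python
import numpy as np
from collections import Counter

def knn_predict(query, vectors, labels, k=5):
    """Majority label among the k nearest rows of `vectors` to `query`,
    under Euclidean distance. With k=1 and author labels this is the
    same-representation author identifier; with k=5 and topic labels
    it is the same-representation topic classifier."""
    dists = np.linalg.norm(np.asarray(vectors) - np.asarray(query), axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]
```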

**Table 1.** Number of correct predictions of author/topic in the 20-author set (left) and 50-author set (right), using 1-NN for same-representation author identification (SRauth), 5-NN for same-representation topic classification (SRtopic), the Koppel algorithm for different-representation author identification (DRauth) and fastText for different-representation topic classification (DRtopic).


*Results:* Table 1 contains the results for both document sets, for the unmodified snippets ("none") or with the privacy mechanism of Algorithm 2 applied at various levels of ε: we give results for ε between 10 and 30, as at ε = 40 the text does not change, while at ε = 1 the text is unrecognisable. For the 20-author set, a random guess baseline would give 1 correct author prediction and 10 correct topic predictions; for the 50-author set, these values are 1 and 25 respectively.

Performance on the unmodified snippets using different-representation inference mechanisms is quite good: author identification gets 15/20 correct for the 20-author set and 27/50 for the 50-author set; topic classification gets 18/20 and 43/50 (comparable to the validation set accuracy, although slightly lower, which is to be expected given that the texts are much shorter). For various levels of ε, with our different-representation inference mechanisms we see broadly the behaviour we expected: the performance of author identification drops, while topic classification holds roughly constant. Author identification here does not drop to chance levels: we speculate that this is because (in spite of our choice of dataset for this purpose) there are still some topic clues that the algorithm of [28] takes advantage of: one author of Harry Potter fan fiction might prefer to write about a particular character (e.g. Severus Snape), and as these character names are not in our word2vec vocabulary, they are not replaced by the privacy mechanism.

In our same-representation author identification, though, we do find performance starting relatively high (although not as high as for the different-representation algorithm) and then dropping to (worse than) chance, which is the level we would expect for our privacy mechanism. The k-NN topic classification, however, shows some instability, which is probably an artefact of the problems it faces with high-dimensional Euclidean spaces. (Refer to our complete arXiv paper [15] for a sample of texts and nearest neighbours.)

### **7 Related Work**

*Author Obfuscation.* The most similar work to ours is by Weggenmann and Kerschbaum [53] who also consider the author obfuscation problem but apply standard differential privacy using a Hamming distance of 1 between all documents. As with our approach, they consider the simplified utility requirement of topic preservation and use word embeddings to represent documents. Our approach differs in our use of the Earth Mover's metric to provide a strong utility measure for document similarity.

An early work in this area by Kacmarcik et al. [23] applies obfuscation by modifying the most important stylometric features of the text to reduce the effectiveness of author attribution. This approach was used in Anonymouth [36], a semi-automated tool that provides feedback to authors on which features to modify to effectively anonymise their texts. A similar approach was also followed by Karadhov et al. [24] as part of the PAN@Clef 2017 task.

Other approaches to author obfuscation, motivated by the PAN@Clef task, have focussed on the stronger utility requirement of semantic sensibility [5,8,34]. Privacy guarantees are therefore ad hoc and are designed to increase misclassification rates by the author attribution software used to test the mechanism.

Most recently there has been interest in training neural networks models which can protect author identity whilst preserving the semantics of the original document [14,48]. Other related deep learning methods aim to obscure other author attributes such as gender or age [10,32]. While these methods produce strong empirical results, they provide no formal privacy guarantees. Importantly, their goal also differs from the goal of our paper: they aim to obscure properties of authors in the *training set* (with the intention of the author-obscured learned representations being made available), while we assume that an adversary may have access to raw training data to construct an inference mechanism with full knowledge of author properties, and in this context aim to hide the properties of some other text external to the training set.

*Machine Learning and Differential Privacy.* Outside of author attribution, there is quite a body of work on introducing differential privacy to machine learning: [13] gives an overview of a classical machine learning setting; more recent deep learning approaches include [1,49]. However, these are generally applied in other domains such as image processing: text introduces additional complexity because of its discrete nature, in contrast to the continuous nature of neural networks. A recent exception is [37], which constructs a differentially private language model using a recurrent neural network; the goal here, as for instances above, is to hide properties of data items in the training set.

*Generalised Differential Privacy.* Also known as d<sub>X</sub>-privacy [9], this definition was originally motivated by the problem of geo-location privacy [4]. Despite its generality, d<sub>X</sub>-privacy has yet to find significant applications outside this domain; in particular, there have been no applications to text privacy.

*Text Document Privacy.* This typically refers to the sanitisation or redaction of documents either to protect the identity of individuals or to protect the confidentiality of their sensitive attributes. For example, a medical document may be modified to hide specifics in the medical history of a named patient. Similarly, a classified document may be redacted to protect the identity of an individual referred to in the text.

Most approaches to sanitisation or redaction rely on first identifying sensitive terms in the text, and then modifying (or deleting) only these terms to produce a sanitised document. Abril et al. [2] proposed this two-step approach, focussing on identification of terms using NLP techniques. Cumby and Ghani [11] proposed *k-confusability*, inspired by *k-anonymity* [50], to perturb sensitive terms in a document so that its (utility) class is confusable with at least k other classes. Their approach requires a complete dataset of similar documents for computing (mis)classification probabilities. Anandan et al. [3] proposed *t-plausibility* which generalises sensitive terms such that any document could have been generated from at least t other documents. S´anchez and Batet [45] proposed *C-sanitisation*, a model for both detection and protection of sensitive terms (C) using information theoretic guarantees. In particular, a *C-sanitised* document should contain no collection of terms which can be used to infer any of the sensitive terms.

Finally, there has been some work on noise-addition techniques in this area. Rodriguez-Garcia et al. [42] propose semantic noise, which perturbs sensitive terms in a document using a distance measure over the directed graph representing a predefined ontology.

Whilst these approaches have strong utility, our primary point of difference is our insistence on a differential privacy-based guarantee. This ensures that every output document could have been produced from any input document with some probability, giving the strongest possible notion of plausible-deniability.

### **8 Conclusions**

We have shown how to combine representations of text documents with generalised differential privacy in order to implement a privacy mechanism for text documents. Unlike most other techniques for privacy in text processing, ours provides a guarantee in the style of differential privacy. Moreover, we have demonstrated experimentally the trade-off between utility and privacy.

This represents an important step towards the implementation of privacy mechanisms that could produce readable summaries of documents with a privacy guarantee. One way to achieve this goal would be to reconstruct readable documents from the bag-of-words output that our mechanism currently provides. A range of promising techniques for reconstructing readable texts from bags-of-words have already produced some good experimental results [20,52,54]. In future work we aim to explore how techniques such as these could be applied as a final post-processing step for our mechanism.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Symbolic Verification of Distance Bounding Protocols**

Alexandre Debant(B) and St´ephanie Delaune

Univ Rennes, CNRS, IRISA, Rennes, France {alexandre.debant,stephanie.delaune}@irisa.fr

**Abstract.** With the proliferation of contactless applications, obtaining reliable information about distance is becoming an important security goal, and specific protocols have been designed for that purpose. These protocols typically measure the round trip time of messages and use this information to infer a distance. Formal methods have proved their usefulness when analysing standard security protocols such as confidentiality or authentication protocols. However, due to their abstract communication model, existing results and tools do not apply to distance bounding protocols.

In this paper, we consider a symbolic model suitable to analyse distance bounding protocols, and we propose a new procedure for analysing (a bounded number of sessions of) protocols in this model. The procedure has been integrated in the Akiss tool and tested on various distance bounding and payment protocols (e.g. MasterCard, NXP).

### **1 Introduction**

In recent years, contactless communications have become ubiquitous. They are used in various applications such as access control cards, keyless car entry systems, and payments; these applications often require some form of authentication, and rely on security protocols for it. In addition, contactless systems aim to prevent *relay attacks*, in which an adversary mounts an attack by simply forwarding the messages he receives: ensuring physical proximity is a new security concern for all these applications.

Formal modelling and analysis techniques are well adapted to verifying security protocols, and nowadays several verification tools exist, e.g. ProVerif [8] and Tamarin [28]. They aim at discovering logical attacks, and therefore consider a symbolic model in which cryptographic primitives are abstracted by function symbols. Since its beginnings in the 1980s, a lot of progress has been made in this area, and it is now common good practice to formally analyse protocols using symbolic techniques in order to spot flaws, possibly before their deployment, as was recently done e.g. for TLS 1.3 [7,17] or for an avionic protocol [9].

This work has been partially supported by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No 714955-POPSTAR).

These symbolic techniques are based on the so-called Dolev-Yao model [20]. In such a model, the attacker is assumed to control the entire network. He can send any message he is able to build from his current knowledge, and this message reaches its final destination instantaneously. This model is accurate enough to analyse many security protocols, e.g. authentication protocols, e-voting protocols, . . . However, to analyse protocols that aim to prevent relay attacks, some features need to be modelled more faithfully. Among them:


This has some implications for the attacker model. Since communications take time, it may be interesting to consider several malicious nodes. We will assume that malicious nodes collaborate, but again messages cannot travel (even between malicious nodes) faster than the speed of light.

**Akiss in a Nutshell.** The procedure we present in this paper builds on previous work by Chadha et al. [12] and its implementation in the tool Akiss. Akiss allows automated analysis of privacy-type properties (modelled as equivalences) when restricted to a bounded number of sessions. Cryptographic primitives may be defined through arbitrary convergent equational theories that have the finite variant property. This class includes standard cryptographic primitives as well as less commonly supported ones such as blind signatures and zero-knowledge proofs. Termination of the procedure is guaranteed for subterm convergent theories, but is also achieved in practice on several examples outside this class.

The procedure behind Akiss is based on an abstract modelling of symbolic traces into first-order Horn clauses: each symbolic trace is translated into a set of Horn clauses called *seed statements*, and a dedicated resolution procedure is applied on this set to construct a set of statements which have a simple form: the so-called *solved statements*. Once the saturation of the set of seed statements is done, it is possible to decide, based solely on those solved statements, whether processes under study are equivalent or not.

Even though we are considering reachability properties (here, authentication with physical proximity), in order to satisfy timing constraints we may need to consider recipes that are discarded when performing a classical reachability analysis. Typically, in a classical reachability analysis, there is no need to consider two recipes that deduce the same message. The main advantage of Akiss is that, since its original goal is to deal with equivalence, it considers more (actually almost all possible) recipes when performing the security analysis. Moreover, even though the tool has been designed to deal with equivalence-based properties, the first part of the Akiss procedure consists in computing a knowledge base which is in fact a finite representation of all possible traces (including recipes) executable by the process under study. We build on this saturation procedure in this work.

**Our Contributions.** We design a new procedure for verifying reachability properties for protocols written in a calculus sharing many similarities with the one introduced in [19], which gives us a way to model distance bounding protocols faithfully. Our procedure follows the general structure of the original one described in [12]. We first model protocols as traces (see Sect. 3), and then translate them into Horn clauses (see Sect. 4). A direct generalisation would consist in keeping the saturation procedure unchanged, and simply modifying the algorithm to check the satisfiability of our additional timing constraints at the end. However, as discussed in Sect. 5, such a procedure would *not* be complete for our purposes. We therefore completely redesign the update function used during the saturation procedure, using a new strategy to forbid certain steps that would otherwise systematically lead to non-termination in our final algorithm. Showing that these statements are indeed unnecessary requires essential changes to the proofs of completeness of the original procedure.

This new saturation procedure yields an effective method for checking reachability properties in our calculus (see Sect. 6). Although termination of saturation is not guaranteed in theory, we have implemented our procedure and we have demonstrated its effectiveness on various examples. We report on our implementation and the various case studies we have performed in Sect. 7.

As we were unable to formally establish completeness of the procedure as implemented in the original Akiss tool (due to some mismatches between the procedure described in [12] and its implementation), we decided to bring the theory closer to the practice, and this explains several differences between our seed statements and those described originally in [12].

A full version of this paper including proofs is available at [18].

### **2 Background**

We start by providing some background on distance bounding protocols. For illustrative purposes, we present a slightly simplified version of the TREAD protocol [2] together with the attack discovered in [26] (relying on the Tamarin prover). This protocol will be used throughout the paper as a running example.

#### **2.1 Distance Bounding Protocols**

Distance bounding protocols are cryptographic protocols that enable a verifier V to establish an upper bound on the physical distance to a prover P. They are typically based on timing the delay between sending out a challenge and receiving back the corresponding response. The first distance bounding protocol was proposed by Brands and Chaum [10], and since then various protocols have been proposed. In general, distance bounding protocols are made of two or three phases, the second one being a rapid phase during which the time measurement is performed. To improve accuracy, this challenge/response exchange during which

**Fig. 1.** TREAD protocol (left) and a mafia fraud attack (right)

the measurement is performed is repeated several times, and often performed at the bit level. Symbolic analysis does not allow us to reason at this level, and thus the rapid phase will be abstracted by a single challenge/response exchange, and operations done at bit level will be abstracted too.

For illustration purposes, we consider the TREAD protocol. As explained before, we ignore several details that are irrelevant to our symbolic security analysis, and we obtain the protocol described in Fig. 1. First, the prover generates a nonce γ, and computes the signature σ with his own key. This signature is sent to V encrypted with the public key of V . Upon reception, the verifier decrypts the message and checks the signature. Then, the verifier sends a nonce m, and starts the rapid phase during which he sends a challenge c to the prover. The protocol ends successfully if the answer given by the prover is correct and arrived before a predefined threshold.
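The honest flow of Fig. 1 can be sketched over a toy term algebra (tuple-encoded terms; the shape h(c, m, γ) of the rapid-phase response is our assumption, based on the arity-3 hash h introduced later in Example 1):

```python
def pk(k): return ('pk', k)
def aenc(m, k): return ('aenc', m, k)   # encrypt m under public key k
def sign(m, k): return ('sign', m, k)

def adec(c, sk):
    """adec(aenc(x, pk(y)), y) = x"""
    assert c[0] == 'aenc' and c[2] == pk(sk)
    return c[1]

def check(s, pub):
    """check(sign(x, y), pk(y)) = ok"""
    return s[0] == 'sign' and pk(s[2]) == pub

def honest_tread_session(skp, skv, gamma, m, c):
    """One honest run of the simplified TREAD exchange of Fig. 1."""
    # Prover: sign the nonce gamma and encrypt it for the verifier.
    sigma = sign(gamma, skp)
    msg1 = aenc(('pair', gamma, sigma), pk(skv))
    # Verifier: decrypt, then verify the signature.
    _, gamma_v, sigma_v = adec(msg1, skv)
    assert check(sigma_v, pk(skp))
    # Rapid phase (abstracted to one exchange): challenge c goes out,
    # and the prover's answer is assumed to have the shape h(c, m, gamma).
    response = ('h', c, m, gamma)
    return response == ('h', c, m, gamma_v)
```

In the symbolic model the timing side of the check (the response arriving before the threshold) is captured separately, by the timing constraints of Sect. 3.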

#### **2.2 Attacks on Distance Bounding Protocols**

Typically, an attack occurs when a verifier is deceived into believing it is co-located with a given prover whereas it is not. An attacker may replay, relay, and build new messages, as well as predict some timed challenges. Since the introduction of distance bounding protocols, various kinds of attacks have emerged, e.g. distance fraud, mafia fraud, distance hijacking, . . . For instance, a distance fraud involves only a dishonest prover who tries to authenticate remotely, whereas a distance hijacking scenario allows the dishonest prover to take advantage of honest agents in the neighbourhood of the verifier.

The TREAD protocol is vulnerable to a mafia fraud attack: an honest verifier v may successfully end a session with an honest prover p, thinking that this prover p is in his vicinity whereas p is actually far away. The attack is described in Fig. 1. After learning γ and a signature σ = sign(γ, skp), the malicious agent i will be able to impersonate p. At the end, the verifier v will finish his session correctly, thinking that he is playing with p (who is actually far away).

#### **2.3 Symbolic Security Analysis**

The first symbolic framework developed to analyse distance bounding protocols is probably the one proposed in [27]. Since then, several formal symbolic models have been proposed: *e.g.* a model based on multiset rewriting rules has been proposed in [5], and another one based on strand spaces is available in [31]. However, these models do not come with a procedure allowing one to analyse distance bounding protocols in an automatic way. Recently, some attempts have been made to rely on existing automatic verification tools, e.g. ProVerif [13,19] or Tamarin [26]. Those tools typically consider an unbounded number of sessions, and some approximations are therefore performed to tackle this problem, which is well known to be undecidable [21].

Here, following the long line of research on symbolic verification for a bounded number of sessions, a problem well known to be decidable [29,32] and for which automatic verification tools have been developed (e.g. OFMC [6], Akiss [12]), we aim to extend this approach to distance bounding protocols.

### **3 A Security Model Dealing with Time and Location**

We assume that our cryptographic protocols are modelled using a simple process calculus sharing some similarities with the applied-pi calculus [1], and strongly inspired by the calculus introduced in [19].

#### **3.1 Term Algebra**

As usual in symbolic models, we represent messages using a term algebra. We consider a set N of *names* split into two disjoint sets: the set Npub of *public names* which contains the set A of agent names, and the set Npriv of *private names*. We consider the set X of *message variables*, denoted x, y, . . ., as well as a set W of *handles*: W = {w1,w2,...}. Variables in X model arbitrary data expected by the protocol, while variables in W are used to store messages learnt by the attacker. Given a *signature* Σ, i.e. a finite set of function symbols together with their arity, and a set of atomic data At, we denote T (Σ, At) the set of terms built from At using function symbols in Σ. Given a term u, we denote *st*(u) the set of the subterms occurring in u, and *vars*(u) the set of variables occurring in u. A term u is ground when *vars*(u) = ∅. Then, we associate an *equational theory* E to the signature Σ which consists of a finite set of equations of the form u = v with u, v ∈ T (Σ, X ), and induces an equivalence relation over terms denoted =E.
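The functions *st*(u) and *vars*(u) are easy to realise over tuple-encoded terms (the encoding, and the convention that variables are strings starting with 'x', are our own illustration choices):

```python
def subterms(t):
    """st(u): the set of subterms of a tuple-encoded term,
    including the term itself."""
    out = {t}
    if isinstance(t, tuple):
        for arg in t[1:]:          # t[0] is the function symbol
            out |= subterms(arg)
    return out

def variables(t, is_var=lambda a: isinstance(a, str) and a.startswith('x')):
    """vars(u), under the convention that variables are 'x...'-strings."""
    return {s for s in subterms(t) if is_var(s)}

def is_ground(t):
    """A term u is ground when vars(u) is empty."""
    return not variables(t)
```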

*Example 1.* Σex = {aenc, adec, pk, sign, getmsg, check, ok, ⟨·, ·⟩, proj1, proj2, h} allows us to model the cryptographic primitives used in the TREAD protocol presented in Sect. 2. The function symbols aenc and adec of arity 2 model asymmetric encryption, whereas sign, getmsg, check, and ok are used to model signatures. The term pk(sk) represents the public key associated to the private key sk. We have function symbols ⟨·, ·⟩, proj1, and proj2 to model pairs and projections, as well as a function h of arity 3 to model hashes. The equational theory Eex associated with the signature Σex is the relation induced by:

$$\begin{array}{lll} \mathsf{check}(\mathsf{sign}(x,y),\mathsf{pk}(y)) = \mathsf{ok} & \mathsf{proj}\_1(\langle x,y\rangle) = x & \mathsf{adec}(\mathsf{aenc}(x,\mathsf{pk}(y)),y) = x\\ \mathsf{getmsg}(\mathsf{sign}(x,y)) = x & \mathsf{proj}\_2(\langle x,y\rangle) = y \end{array}$$

We consider equational theories that can be represented by a *convergent rewrite system*, i.e. we assume that there is a *confluent* and *terminating* rewrite system such that:

$$u =\_{\mathsf{E}} v \iff u{\downarrow} = v{\downarrow} \quad \text{for any terms } u \text{ and } v$$

where t↓ denotes the normal form of t. Moreover, we assume that such a rewrite system has the *finite variant property* as introduced in [16]. This means that, given a sequence t1,...,tn of terms, it is possible to compute a finite set of substitutions, denoted variants(t1,...,tn), such that for any substitution ω there exist σ ∈ variants(t1,...,tn) and τ such that tiω↓ = (tiσ↓)τ for each i ∈ {1,...,n}. Many equational theories enjoy this property, e.g. symmetric/asymmetric encryption, signatures and blind signatures, as well as zero-knowledge proofs.

Moreover, this finite variant property implies the existence of a finite and *complete set of unifiers* and gives us a way to compute it effectively. Given a set U of equations between terms, a *unifier* (modulo a rewrite system R) is a substitution σ such that sσ↓ = s′σ↓ for any equation s = s′ in U. A set S of unifiers is said to be *complete* for U if for any unifier σ there exist θ ∈ S and τ such that σ = τ ◦ θ. We denote by csuR(U) such a set. We will rely on these notions of variants and csu in our procedure (see Sect. 4).

*Example 2.* The finite variant property is satisfied by the rewrite system Rex obtained by orienting from left to right equations in Eex.

Let U = {check(tσ, pk(skp)) = ok} with t<sup>σ</sup> = proj2(adec(x, skv)). We have that {θ} with θ = {x → aenc(⟨x1, sign(x2, skp)⟩, pk(skv))} is a complete set of unifiers for U (modulo Rex). Now, considering the variants, let σ<sup>1</sup> = {x → aenc(x1, pk(skv))}, σ<sup>2</sup> = {x → aenc(⟨x1, x2⟩, pk(skv))} and *id* be the identity substitution; we have that {*id*, σ1, σ2} is a finite and complete set of variants (modulo Rex) for the sequence (x, tσ).
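Normal forms under a convergent subterm rewrite system such as Rex can be computed with a single bottom-up pass, because every right-hand side is either a subterm of the left-hand side or the constant ok; a toy sketch over tuple-encoded terms such as ('aenc', m, ('pk', k)):

```python
def normalise(t):
    """t-down-arrow: normalise a tuple-encoded term under the Rex
    rules, applied bottom-up (one pass suffices for subterm rules,
    since rewriting a node cannot create a new redex below it)."""
    if not isinstance(t, tuple):
        return t
    t = (t[0],) + tuple(normalise(arg) for arg in t[1:])
    head, args = t[0], t[1:]
    if head == 'adec' and args[0][0:1] == ('aenc',) and args[0][2] == ('pk', args[1]):
        return args[0][1]
    if head == 'check' and args[0][0:1] == ('sign',) and args[1] == ('pk', args[0][2]):
        return 'ok'
    if head == 'getmsg' and args[0][0:1] == ('sign',):
        return args[0][1]
    if head == 'proj1' and args[0][0:1] == ('pair',):
        return args[0][1]
    if head == 'proj2' and args[0][0:1] == ('pair',):
        return args[0][2]
    return t
```

For instance, decrypting an honest encryption recovers the plaintext, and a valid signature check normalises to ok.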

An attacker builds her own messages by applying function symbols to terms she already knows, which are available through variables in W. Formally, a computation done by the attacker is a *recipe*, i.e. a term in T(Σ, W ∪ Npub ∪ R<sup>+</sup>).

#### **3.2 Timing Constraints**

To model time, we will use non-negative real numbers R+, and we may allow various operations (e.g. +, −, ×, . . . ). A time expression is constructed inductively by applying arithmetic symbols to time expressions, starting from the initial set R+ and an infinite set Z of *time variables*. Then, a timing constraint is typically of the form t1 ∼ t2 with ∼ ∈ {<, ≤, =}. We do not constrain the operators since our procedure is generic in this respect, provided we have a way to decide whether a set of timing constraints is satisfiable or not. In practice, our tool (see Sect. 7) will only be able to consider simple linear timing constraints.

*Example 3.* When modelling distance bounding protocols, we will typically consider a timing constraint of the form z2 − z1 < t with z1, z2 ∈ Z and t ∈ R+. This constraint expresses that the time elapsed between the emission of a challenge and the receipt of the corresponding answer is less than t.
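One classical way to decide conjunctions of such constraints is to view them as difference constraints and look for a negative cycle with Bellman-Ford. The sketch below is an illustration of that idea, not the tool's solver, and it is restricted to non-strict constraints zi − zj ≤ c (an equality is encoded as two such constraints; strict inequalities would need extra bookkeeping).

```python
def satisfiable(constraints):
    """constraints: list of (zi, zj, c) meaning zi - zj <= c.
    Encode each constraint as an edge zj --c--> zi; the conjunction is
    unsatisfiable iff the constraint graph has a negative cycle."""
    nodes = {v for zi, zj, _ in constraints for v in (zi, zj)}
    # Distances as if from a virtual source linked to every node with weight 0.
    dist = {v: 0.0 for v in nodes}
    edges = [(zj, zi, c) for zi, zj, c in constraints]
    for _ in range(len(nodes)):
        changed = False
        for u, v, c in edges:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
                changed = True
        if not changed:
            return True
    # An edge is still relaxable: negative cycle, hence unsatisfiable.
    return not any(dist[u] + c < dist[v] for u, v, c in edges)
```

For instance, z2 − z1 ≤ 5 together with z1 − z2 ≤ −3 is satisfiable, whereas z2 − z1 ≤ 1 together with z1 − z2 ≤ −2 is not.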

#### **3.3 Process Algebra**

We assume that cryptographic protocols are modelled using a simple process algebra. Following [12], we only consider a minimalistic core calculus. In particular, we do not introduce the new operator, and we do not explicitly model the parallel operator. Since we only consider a bounded number of sessions (i.e. a calculus with no replication), this entails no loss of expressivity. We can simply assume that fresh names are generated from the beginning, and parallel composition can be added as syntactic sugar to denote the set of all interleavings.
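Reading parallel composition as the set of all interleavings can be sketched as follows (an illustration; the list-of-actions encoding is a choice made here):

```python
def interleavings(t1, t2):
    """All interleavings of two traces, each given as a list of actions."""
    if not t1:
        return [list(t2)]
    if not t2:
        return [list(t1)]
    # Either the first action of t1 comes first, or that of t2 does.
    return [[t1[0]] + rest for rest in interleavings(t1[1:], t2)] + \
           [[t2[0]] + rest for rest in interleavings(t1, t2[1:])]
```

Two traces of lengths m and n yield C(m+n, m) interleavings, which is why bounding the number of sessions matters in practice.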

**Syntax.** We model a protocol as a finite set of traces. A *trace* T is a finite sequence (possibly empty, and denoted ε in this case) of pairs, i.e. T = (a1, a1). · · · .(an, an) where each ai ∈ A, and ai is an action of the form:

$$\begin{array}{ccc} \mathsf{out}^z(u) & \mathsf{in}^z(x) & [v = v'] & [z := v] & [t\_1 \sim t\_2] \end{array}$$

with x ∈ X, u, v, v′ ∈ T(Σ, N ∪ R+ ∪ X), z ∈ Z, and t1 ∼ t2 a timing constraint.

As usual, we have output and input actions. An input action acts as a binding construct for both x and z, whereas an output action acts as a binding construct for z only. For the sake of clarity, we will omit the time variable z when we do not care about the precise time at which the input (resp. output) action has been performed. As usual, our calculus allows one to perform some tests on received messages, and it is also possible to extract a timestamp from a received message and perform some tests on this extracted value using timing constraints. Typically, this will allow us to model an agent that stops executing the protocol in case an answer arrives too late.

We assume the usual definitions of *free* and *bound variables* for traces, and we assume that each variable is bound at most once. Note that, in the constructs presented above, the variables z and x are bound. Given a set V of variables, a trace is *locally closed w.r.t.* V if for any agent a, the trace obtained by considering the actions executed by agent a does not contain free variables among those in V. Such an assumption, sometimes called origination [6,15], is always satisfied when considering traces obtained by interleaving the actions of a protocol. Therefore, we will only consider traces that are locally closed w.r.t. both X and Z.

Contrary to the calculus introduced in [19] which assumes that there is at most one timer per thread, we are more flexible. This generalisation is not mandatory to analyse our case studies but it allows us to present our result on traces and greatly simplifies the theoretical development.

*Example 4.* Following our syntax, the trace corresponding to the role of the verifier played by v with p is modelled as follows:

$$\begin{array}{l} T\_{\mathbf{ex}} = (v, \mathsf{in}(x)). \ (v, [\mathsf{check}(t\_{\sigma}, \mathsf{pk}(sk\_{p})) = \mathsf{ok}]). \ (v, [t\_{\gamma} = \mathsf{getmsg}(t\_{\sigma})]). \\ (v, \mathsf{out}(m)). \\ (v, \mathsf{out}^{z\_{1}}(c)). \ (v, \mathsf{in}^{z\_{2}}(y)). \ (v, [y = \mathsf{h}(c, m, t\_{\gamma})]). \ (v, [z\_{2} - z\_{1} < 2 \times t\_{0}]) \end{array}$$

where tγ = proj1(adec(x, skv)), tσ = proj2(adec(x, skv)), x, y ∈ X, z1, z2 ∈ Z, m, c, skv, skp ∈ Npriv, and t0 ∈ R+ is a fixed threshold.

Of course, when performing a security analysis, other traces have to be considered. Typically, we may want to consider several instances of each role, and we will have to generate traces corresponding to all the possible interleavings of the actions composing these roles.

**Semantics.** The semantics of a trace is given in terms of a labeled transition system over configurations of the form (T;Φ;t), and is parametrised by a topology reflecting the fact that interactions between agents depend on their location.

**Definition 1.** *A* topology *is a tuple* T0 = (A0, M0, Loc0) *where* A0 ⊆ A *is the finite set of agents composing the system,* M0 ⊆ A0 *represents those that are malicious, and* Loc0 : A0 → R³ *defines the position of each agent in space.*

In our model, the distance between two agents is given by the time it takes for a message to travel from one to another. We have that:

$$\mathsf{Dist}\_{T\_0}(a,b) = \frac{||\mathsf{Loc}\_0(a) - \mathsf{Loc}\_0(b)||}{c\_0} \text{ for any } a, b \in \mathcal{A}\_0$$

with ‖·‖ : R³ → R the Euclidean norm and c0 the transmission speed. We suppose, from now on, that c0 is a constant for all agents, and thus an agent a can recover, at time t + DistT0(a, b), any message emitted by the agent b before t ∈ R+.
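The definition of DistT0 is a time-of-flight computation. A minimal sketch (the numeric value chosen for c0, here the speed of light in m/s, is an assumption of this illustration):

```python
import math

C0 = 3.0e8  # transmission speed c0 (speed of light in m/s, an assumption)

def dist(loc_a, loc_b):
    """Dist_T0(a, b): time for a message to travel between the two agents."""
    return math.dist(loc_a, loc_b) / C0

def recoverable(t_emit, t_recv, loc_b, loc_a):
    """Can agent a, at time t_recv, recover a message emitted by b at t_emit?"""
    return t_recv >= t_emit + dist(loc_b, loc_a)
```

For two agents 3.0e8 metres apart, a message emitted at time 0 is recoverable from time 1.0 on, but not at time 0.5.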

**Definition 2.** *Given a topology* T0 = (A0, M0, Loc0)*, a* configuration *over* T0 *is a tuple* (T; Φ; t) *where* T *is a trace locally closed w.r.t.* X *and* Z *composed of actions* (a, a) *with* a ∈ A0*,* t ∈ R+*, and* Φ = {w1 −a1,t1→ u1, . . . , wn −an,tn→ un} *is an* extended frame*, i.e. a substitution such that* wi ∈ W*,* ui ∈ T(Σ, N ∪ R+)*,* ai ∈ A0 *and* ti ∈ R+ *for* 1 ≤ i ≤ n*.*

Intuitively, T represents the trace that still remains to be executed; Φ represents the messages that have been outputted so far; and t is the global time.

*Example 5.* Continuing Example 4, we consider the topology T<sup>0</sup> = (A0,M0, Loc0) depicted on the right where A<sup>0</sup> = {p, v, i}, and M<sup>0</sup> = {i}.

The precise location of each agent is not relevant, only the distance between them matters. Here DistT<sup>0</sup> (v, i) < t<sup>0</sup> whereas DistT<sup>0</sup> (v, p) ≥ t0.

A possible configuration is K<sup>0</sup> = (Tex;Φ0; 0) with

$$\Phi\_0 = \{ \mathbf{w}\_1 \xrightarrow{i,0} \mathbf{pk}(sk\_v), \ \mathbf{w}\_2 \xrightarrow{i,0} sk\_i, \ \mathbf{w}\_3 \xrightarrow{p,0} \mathbf{aenc}(\langle \gamma, \mathbf{sign}(\gamma, sk\_p) \rangle, \ \mathbf{pk}(sk\_i)) \}.$$

We have that v is playing the verifier's role with p (who is far away). We do not consider any prover's role, but we assume that p (acting as a prover) has started a session with i, and thus the corresponding encryption (here γ ∈ Npriv) has been added to the knowledge of the attacker (handle w3). We also assume that ski ∈ Npriv, the private key of the agent i ∈ M0, is known by the attacker. A more realistic configuration would include other instances of the prover and verifier roles and would probably give more knowledge to the attacker. This simple configuration is actually sufficient to retrieve the attack presented in Sect. 2.2. We write [Φ]_a^t for the restriction of Φ to the agent a at time t, i.e.:

$$\left[\Phi\right]\_a^t = \left\{\mathbf{w}\_i \xrightarrow{a\_i, t\_i} u\_i \mid (\mathbf{w}\_i \xrightarrow{a\_i, t\_i} u\_i) \in \Phi \text{ and } a\_i = a \text{ and } t\_i \le t\right\}.$$
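The restriction [Φ]_a^t is a simple filter over the extended frame. A sketch, with frames represented as dictionaries mapping handles to (agent, time, term) entries (a representation chosen for this illustration):

```python
def restrict(frame, agent, t):
    """[Phi]_a^t: keep the outputs of `agent` performed no later than t."""
    return {w: (a, ti, u) for w, (a, ti, u) in frame.items()
            if a == agent and ti <= t}

phi0 = {  # Phi_0 from Example 5, with gamma written as 'g'
    'w1': ('i', 0.0, ('pk', 'sk_v')),
    'w2': ('i', 0.0, 'sk_i'),
    'w3': ('p', 0.0, ('aenc', ('pair', 'g', ('sign', 'g', 'sk_p')),
                      ('pk', 'sk_i'))),
}
```

For instance, `restrict(phi0, 'i', 0.0)` keeps exactly the handles w1 and w2.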

Our labeled transition system is given in Fig. 2 and relies on labels which can be either equal to the unobservable τ action, or of the form (a, a) with a ∈ A and a ∈ {test, eq} ∪ {in(u), out(u) | u ∈ T(Σ, N ∪ R+)} ∪ {let(v) | v ∈ R+}. The TIM rule allows time to elapse and is labeled with τ (often omitted for the sake of simplicity). The OUT rule allows an output action to be executed, and the outputted term is added to the frame. Rule EQ is used to perform some tests, and those tests are evaluated modulo the equational theory. The LET rule allows us to evaluate a term that is supposed to contain a real number, which can then be used in a timing constraint through the variable z; the TEST rule then evaluates a timing constraint. Finally, the IN rule allows an agent a to execute an input: the received message u has been sent at time tb by an agent b who was in possession of the message at that time. In case b is a malicious agent, i.e. b ∈ M0, the message u may have been forged through a recipe R, and b has to be in possession of all the necessary information at that time. The variable z is used to store the time at which this action has been executed.

*Example 6.* Continuing Example 5, we may consider the following execution which aims to mimic the trace developed in Sect. 2:

$$K\_0 \xrightarrow{\tau}\_{T\_0} \xrightarrow{v,\mathsf{in}(t\_{\mathsf{aenc}})}\_{T\_0} \xrightarrow{v,\mathsf{eq}}\_{T\_0} \xrightarrow{v,\mathsf{eq}}\_{T\_0} \xrightarrow{v,\mathsf{out}(m)}\_{T\_0} K\_{\mathsf{rapid}}$$

The first arrow corresponds to an application of the rule TIM with delay δ0 ≥ DistT0(p, i) + DistT0(i, v). Then, the IN rule is triggered considering that the message taenc = aenc(⟨γ, sign(γ, skp)⟩, pk(skv)) is sent by i at time ti such that DistT0(p, i) ≤ ti ≤ δ0 − DistT0(i, v). Such a message taenc can indeed be forged by i at time ti (using the recipe R = aenc(adec(w3, w2), w1)) and thus be


$$\text{TEST} \quad \big((a, [t\_1 \sim t\_2]).T;\, \Phi;\, t\big) \xrightarrow{a,\mathsf{test}}\_{T\_0} (T;\, \Phi;\, t) \quad \text{if } t\_1 \sim t\_2 \text{ is true}$$

$$\text{IN} \qquad \big((a, \mathsf{in}^z(x)).T;\, \Phi;\, t\big) \xrightarrow{a,\mathsf{in}(u)}\_{T\_0} \big(T\{x \to u,\, z \to t\};\, \Phi;\, t\big)$$

**Fig. 2.** Semantics of our calculus

received by v at time δ0. Then, tests performed by v are evaluated successfully, v outputs m, and we reach the configuration Krapid = (Trapid;Φrapid; δ0) where:

$$\begin{array}{l} - \ T\_{\mathsf{rapid}} = (v, \mathsf{out}^{z\_1}(c)).\, (v, \mathsf{in}^{z\_2}(y)).\, (v, [y = \mathsf{h}(c, m, \gamma)]).\, (v, [z\_2 - z\_1 < 2t\_0]), \text{ and} \\ - \ \Phi\_{\mathsf{rapid}} = \Phi\_0 \uplus \{\mathsf{w}\_4 \xrightarrow{v,\delta\_0} m\}. \end{array}$$

We can pursue this execution as follows:

$$\begin{split} & K\_{\mathsf{rapid}} \xrightarrow{v,\mathsf{out}(c)}\_{T\_0} \xrightarrow{\tau}\_{T\_0} \xrightarrow{v,\mathsf{in}(\mathsf{h}(c,m,\gamma))}\_{T\_0} \xrightarrow{v,\mathsf{eq}}\_{T\_0} \\ & \quad \big( (v, [\delta\_0 + 2\mathsf{Dist}\_{T\_0}(v, i) - \delta\_0 < 2t\_0]);\ \Phi\_{\mathsf{rapid}} \uplus \{\mathsf{w}\_5 \xrightarrow{v,\delta\_0} c\};\ \delta\_0 + 2\mathsf{Dist}\_{T\_0}(v, i) \big) \end{split}$$

The second arrow is an application of the rule TIM with delay 2DistT0(v, i) so that h(c, m, γ) can be received by v at time δ0 + 2DistT0(v, i). Since DistT0(v, i) < t0, the timing constraint is true and the last action can be executed.

The goal of this paper is to propose a new procedure for analysing a bounded number of sessions of distance bounding protocols. Once the topology is fixed, the existence of an attack can be directly encoded as a reachability property considering a finite set of traces. The following sections are thus dedicated to the study of the following problem:

**Input:** A trace T locally closed w.r.t. X and Z, t0 ∈ R+, and a topology T0. **Output:** Do there exist ℓ1, . . . , ℓn, Φ, and t such that (T; ∅; t0) −ℓ1,...,ℓn→T0 (ε; Φ; t)?

### **4 Modelling Using Horn Clauses**

Following the approach developed in Akiss [12], our procedure is based on an abstract modelling of a trace in first-order Horn clauses. Our set of seed

$$\begin{array}{llll} \text{OUT} & ((a, \mathsf{out}^z(u)).T;\, \phi) \xrightarrow{a,\mathsf{out}(u)} (T;\, \phi \uplus \{\mathsf{w} \to u\}) & \text{with } \mathsf{w} \in \mathcal{W} \text{ fresh} \\ \text{EQ} & ((a, [u = v]).T;\, \phi) \xrightarrow{a,\mathsf{eq}} (T;\, \phi) & \text{if } u{\downarrow} = v{\downarrow} \\ \text{LET} & ((a, [z := v]).T;\, \phi) \xrightarrow{a,\mathsf{let}(v)} (T;\, \phi) & \\ \text{TEST} & ((a, [t\_1 \sim t\_2]).T;\, \phi) \xrightarrow{a,\mathsf{test}} (T;\, \phi) & \\ \text{IN} & ((a, \mathsf{in}^z(x)).T;\, \phi) \xrightarrow{a,\mathsf{in}(u)} (T\{x \to u\};\, \phi) & \text{if } u = R\phi{\downarrow} \text{ for some recipe } R. \end{array}$$

#### **Fig. 3.** Relaxed semantics

statements is more in line with what has been implemented in Akiss for optimisation purposes rather than what is presented in [12].

#### **4.1 Preliminaries**

We consider *symbolic runs*, which are finite sequences of pairs, possibly ending with a *run variable* typically denoted y. Each pair (a, a) is such that a ∈ A and a is an action of the form (with u ∈ T(Σ, N ∪ R+ ∪ X)):

$$\mathsf{out}(u) \qquad \mathsf{in}(u) \qquad \mathsf{eq} \qquad \mathsf{test} \qquad \mathsf{let}(u).$$

Excluding the special variable y, a symbolic run (a1, a1). · · · .(an, an) only contains variables from the set X. We say that it is *locally closed* if whenever a variable x occurs in an output action (resp. let action) aj, then there exists an input action ai occurring before (i.e. i < j) and performed by the same agent (i.e. ai = aj) such that x ∈ *vars*(ai). Symbolic runs are often denoted w, w′, . . . , and we write w ⊑ w′ when the sequence w is a prefix of w′. Given a symbolic run w0 whose sequence of outputs is out(u1) · . . . · out(un), we denote φ(w0) = {w1 → u1, . . . , wn → un}.

We also consider *symbolic recipes* which are terms in T (Σ, W∪Npub ∪ Y) where Y is a set of recipe variables disjoint from X and W. We use capital letters X, Y , and Z to range over Y.

*Example 7.* We consider the following symbolic run:

$$w\_0 = (v, \text{in}(\mathsf{aenc}(\langle x', \mathsf{sign}(x', sk\_p)\rangle, \mathsf{pk}(sk\_v)))). (v, \mathsf{eq}). (v, \mathsf{eq}).$$

$$(v, \mathsf{out}(m)). (v, \mathsf{out}(c)). (v, \mathsf{in}(\mathsf{h}(c, m, x'))). (v, \mathsf{eq})$$

We have that φ(w0) = {w1 → m, w2 → c}.
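Computing φ(w0) amounts to numbering the outputs of the run. A sketch, with symbolic runs encoded as lists of (agent, action) pairs (an encoding chosen for this illustration):

```python
def phi(run):
    """phi(w0): map handles w1, w2, ... to the terms output along the run."""
    outs = [act[1] for _, act in run
            if isinstance(act, tuple) and act[0] == 'out']
    return {f'w{i}': u for i, u in enumerate(outs, start=1)}

# The symbolic run w0 of Example 7, with terms kept abstract as strings/tuples.
w0 = [('v', ('in', 'x_aenc')), ('v', 'eq'), ('v', 'eq'),
      ('v', ('out', 'm')), ('v', ('out', 'c')),
      ('v', ('in', ('h', 'c', 'm', "x'"))), ('v', 'eq')]
```

Here `phi(w0)` yields `{'w1': 'm', 'w2': 'c'}`, matching the example.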

Our logic is based on two predicates expressing deduction and reachability without taking into account timing constraints. More formally, given a configuration (T; Φ; t), its untimed counterpart is (T; φ) where φ is the untimed counterpart of Φ, i.e. a frame of the form φ = {w1 → u1, . . . , wn → un}. The relaxed semantics over untimed configurations is given in Fig. 3. Since time variables (from Z) are not instantiated during a relaxed execution, in an untimed configuration (T; φ), the trace T is only locally closed w.r.t. X. Our predicates are:


Formally, we consider a reachability predicate r and a deduction predicate k: r_{ℓ1,...,ℓn} holds if the trace can be executed with labels ℓ1, . . . , ℓn in the relaxed semantics, and k_{ℓ1,...,ℓn}(R, u) holds if, in addition, the recipe R allows one to deduce the term u from the resulting frame, i.e. Rφ↓ = u.


This semantics is extended as usual to first-order formulas built using the usual connectives (e.g. conjunction, quantification, etc.).

*Example 8.* The frame φ<sup>0</sup> below is the untimed counterpart of Φ0:

$$\phi\_0 = \{\mathfrak{w}\_1 \to \mathfrak{pk}(sk\_v), \,\mathfrak{w}\_2 \to sk\_i, \,\,\mathfrak{w}\_3 \to \mathsf{aenc}(\langle \gamma, \mathsf{sign}(\gamma, sk\_p) \rangle, \,\mathfrak{pk}(sk\_i))\}.$$

We have that (Tex; φ0) −tr→ (ε; φfinal) where φfinal is the untimed counterpart of Φfinal = Φrapid ⊎ {w5 −v,δ0→ c}, and tr is the same sequence of labels as the one developed in Example 6, i.e.

(v, in(taenc))(v, eq)(v, eq)(v, out(m))(v, out(c))(v, in(h(c, m, γ)))(v, eq)(v,test).

#### **4.2 Seed Statements**

We consider particular Horn clauses which we call *statements*.

**Definition 3.** *A* statement *is a Horn clause* H ⇐ k_{w1}(X1, u1), . . . , k_{wn}(Xn, un) *with* H ∈ {r_{w0}, k_{w0}(R, u)} *and such that:*


*When* H = k_{w0}(R, u)*, we assume in addition that vars*(u) ⊆ *vars*(u1, . . . , un) *and* R({Xi → ui} ∪ φ(w0))↓ = u*.*

In the above definition, we implicitly assume that all variables are universally quantified, i.e., all statements are ground. By abuse of language we sometimes call σ a grounding substitution for a statement H ⇐ (B1,...,Bn) when σ is grounding for each of the atomic formulas H, B1,...,Bn. The *skeleton* of a statement f, denoted skl(f), is the statement where recipes are removed.


**Fig. 4.** Seed statements seed(T, C)

Our definition of statement is in line with the original one proposed in [12] but we state an additional invariant used to establish the completeness of our procedure.

In order to define our set of seed statements, we have to fix some naming conventions. Given a trace T of the form (a1, a1).(a2, a2).....(an, an), we assume w.l.o.g. the following naming conventions:


For each m ∈ {0,...,n}, the sets Rcv(m), Snd(m), Eq(m), Let(m), and Test(m) respectively denote the set of indexes of the receive, send, equality, let, and test actions amongst a1,..., am. We denote by |S| the cardinality of S.
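These index sets are straightforward to compute. A sketch over a trace encoded as a list of tagged actions (the tags `'in'`, `'out'`, `'eq'`, `'let'`, `'test'`, mirroring Rcv, Snd, Eq, Let, Test, are this illustration's convention):

```python
def indices(actions, kind, m):
    """Set of indices (1-based) of actions of the given kind among a_1..a_m."""
    return {i for i, act in enumerate(actions[:m], start=1) if act[0] == kind}

# The action kinds of T_ex: in, eq, eq, out, out, in, eq, test.
trace = [('in',), ('eq',), ('eq',), ('out',), ('out',),
         ('in',), ('eq',), ('test',)]
```

For instance, `indices(trace, 'out', 8)` is {4, 5}, so |Snd(8)| = 2 for this trace.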

Given a set C ⊆ Npub ∪ R+, the set of *seed statements* associated to T and C, denoted seed(T, C), is defined in Fig. 4. If C = Npub ∪ R+, then seed(T, C) is said to be the set of seed statements associated to T, and in this case we write seed(T) as a shortcut for seed(T, Npub ∪ R+). When computing seed statements, we compute complete sets of unifiers and complete sets of variants modulo R. This allows us to get rid of the rewrite system in the remainder of our procedure and then only consider unification modulo the empty equational theory. In this case, it is well-known that (when it exists) csu∅(U) is uniquely defined up to some variable renaming, and we write mgu(u1, u2) instead of csu∅({u1 = u2}).
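Unification modulo the empty theory is the textbook mgu computation; the sketch below is an illustration (not Akiss's implementation), reusing the tuple encoding of terms with `?`-prefixed strings as variables.

```python
def mgu(u1, u2):
    """Most general unifier modulo the empty theory, or None if none exists.
    Returns a triangular substitution (bindings may mention bound variables)."""
    sub, todo = {}, [(u1, u2)]
    def walk(t):  # dereference top-level variable bindings
        while isinstance(t, str) and t in sub:
            t = sub[t]
        return t
    def occurs(x, t):
        t = walk(t)
        if t == x:
            return True
        return not isinstance(t, str) and any(occurs(x, a) for a in t[1:])
    while todo:
        s, t = todo.pop()
        s, t = walk(s), walk(t)
        if s == t:
            continue
        if isinstance(s, str) and s.startswith('?'):
            if occurs(s, t):
                return None  # occurs check fails
            sub[s] = t
        elif isinstance(t, str) and t.startswith('?'):
            todo.append((t, s))
        elif isinstance(s, str) or isinstance(t, str) \
                or s[0] != t[0] or len(s) != len(t):
            return None  # symbol clash
        else:
            todo.extend(zip(s[1:], t[1:]))
    return sub

def resolve(t, sub):
    """Fully apply a triangular substitution to a term."""
    if isinstance(t, str):
        return resolve(sub[t], sub) if t in sub else t
    return (t[0],) + tuple(resolve(a, sub) for a in t[1:])
```

For instance, unifying pair(x, h(x)) with pair(a, y) binds y to h(a), while unifying x with h(x) fails by the occurs check.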

*Example 9.* Let T⁺ex = T0 · Tex with T0 = (i, out(pk(skv))).(i, out(ski)).(p, out(u)) and u = aenc(⟨γ, sign(γ, skp)⟩, pk(ski)). The set seed(T⁺ex, ∅) contains among others the statements f1, f2, f3, and f4 given below:

f1 : r_{T0·w0·(v,test)} ⇐ k_{T0}(X1, aenc(⟨x′, sign(x′, skp)⟩, pk(skv))), k_{T0·w0⁵}(X2, h(c, m, x′));
f2 : k_{T0·y}(w3, u) ⇐ ;
f3 : k_y(adec(Y1, Y2), adec(y1, y2)) ⇐ k_y(Y1, y1), k_y(Y2, y2); and its variant
f4 : k_y(adec(Y1, Y2), y3) ⇐ k_y(Y1, aenc(y3, pk(y2))), k_y(Y2, y2)

where w0 is given in Example 7, and w0⁵ is the prefix of w0 of size 5.

Statement f1 expresses that the trace is executable (in the relaxed semantics) as soon as we are able to deduce the two terms requested in input; f2 says that the attacker knows the term u as soon as T0 has been executed. The two remaining statements model the fact that an attacker can apply the decryption algorithm to any terms he knows (statement f3), and that this gives him access to the plaintext when the right key is used (statement f4).

#### **4.3 Soundness and Completeness**

We now show that, as long as timing constraints are ignored, the set seed(T) is a sound and complete abstraction of a trace. Moreover, we have to ensure that the proof tree witnessing the existence of a given predicate in H(seed(T)) matches with the relaxed execution we have considered. This is mandatory to establish the completeness of our procedure.

**Definition 4.** *Given a set* K *of statements,* H(K) *is the smallest set of ground facts such that:*

$$\text{Conseq} \ \dfrac{\begin{array}{c} f = \left(H \Leftarrow B\_1, \ldots, B\_n\right) \in K \qquad B\_1\sigma \in \mathcal{H}(K), \ldots, B\_n\sigma \in \mathcal{H}(K) \\ \sigma \text{ grounding for } f \qquad \mathsf{skl}(f\sigma) \text{ in normal form} \end{array}}{H\sigma \in \mathcal{H}(K)}$$

*Let* Bi = k_{wi}(Xi, ui) *for* i ∈ {1, . . . , n}*, and* w0 *the world associated to* H*, with* v1, . . . , vk *the terms occurring in input in* w0*. We say that such an instance of* Conseq matches with exec = (T; ∅) −ℓ1,...,ℓp→ (S; φ) *using* R1, . . . , Rk *as input recipes if* w0σ ⊑ ℓ1, . . . , ℓp*, and there exist* R̂1, . . . , R̂k *such that:*

*–* R̂j({Xi → ui | 1 ≤ i ≤ n} ∪ φ(w0))↓ = vj *for* j ∈ {1, . . . , k}*; and*
*–* R̂jσ = Rj *for* j ∈ {1, . . . , k}*.*

This notion of matching is extended to a proof tree π as expected, meaning that all the instances of Conseq used in π satisfy the property.

Actually, the completeness of our procedure will be established w.r.t. a subset of recipes, namely *uniform recipes*. We establish that an execution of a trace T<sup>0</sup> which only involves uniform recipes has a counterpart in H(seed(T0)) which is uniform too.

**Definition 5.** *Given a frame* φ*, a recipe* R *is* uniform w.r.t. φ *if for any* R1, R2 ∈ *st*(R) *such that* R1φ↓ = R2φ↓*, we have that* R1 = R2*.*

*Given a set* K *of statements, we say that a set* {π1,...,πn} *of proof trees in* H(K) *is* uniform *if for any* kw(R1, t) *and* kw(R2, t) *that occur in* {π1,...,πn}*, we have that* R<sup>1</sup> = R2*.*
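Uniformity of a recipe can be checked directly from Definition 5: evaluate every subterm in the frame and reject two distinct subterms yielding the same message. A sketch (the term encoding and the naive evaluation, which ignores the rewrite system, are assumptions of this illustration):

```python
def subterms(R):
    """st(R): all subterms of a recipe, recipes being tuples/strings."""
    yield R
    if not isinstance(R, str):
        for a in R[1:]:
            yield from subterms(a)

def evaluate(R, frame):
    """Naive R-phi: replace handles by frame entries (no rewriting here)."""
    if isinstance(R, str):
        return frame.get(R, R)
    return (R[0],) + tuple(evaluate(a, frame) for a in R[1:])

def uniform(R, frame):
    """R is uniform w.r.t. frame if equal-valued subterms are equal."""
    seen = {}
    for r in subterms(R):
        v = evaluate(r, frame)
        if v in seen and seen[v] != r:
            return False
        seen[v] = r
    return True
```

For a frame where two handles point to the same key k, pair(w1, w1) is uniform but pair(w1, w2) is not, since w1 and w2 are distinct recipes for the same message.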

We are now able to state our soundness and completeness result.

**Theorem 1.** *Let* T<sup>0</sup> *be a trace locally closed w.r.t.* X *.*

*–* (T0; ∅) |= g *for any* g ∈ seed(T0) ∪ H(seed(T0))*;*

	- *if* exec = (T0; ∅) −ℓ1,...,ℓp→ (S; φ) *is a relaxed execution with input recipes* R1, . . . , Rk*, then:*
	- *1.* r_{ℓ1,...,ℓp} ∈ H(seed(T0))*; and*
	- *2. if* Rφ↓ = u *for some recipe* R *uniform w.r.t.* φ *then* k_{ℓ1,...,ℓp}(R, u) ∈ H(seed(T0))*.*

*Moreover, we may assume that the proof trees witnessing these facts are uniform and match with* exec *using* R1, . . . , Rk *as input recipes.*

### **5 Saturation**

At a high level, our procedure consists of two steps: first, we saturate the knowledge base obtained from the seed statements (Sect. 5.1); then, we use the resulting solved statements to decide whether the timing constraints can be satisfied (Sect. 6).


#### **5.1 Saturation Procedure**

We start by describing our saturation procedure. It manipulates a set of statements called a *knowledge base*.

**Definition 6.** *Given a statement* f = (H ⇐ B1,...,Bn)*,*


*A set of* well-formed *statements is called a* knowledge base*. If* K *is a knowledge base,* solved(K) = {f ∈ K | f is solved}*.*

We restrict the use of the resolution rule and only apply it on a selected atom. To formalise this, we assume a selection function sel which returns ⊥ when applied on a solved statement, and an atom kw(X, t) with t ∉ X when applied on an unsolved statement. Resolution must be performed on this selected atom.

$$\text{Res} \ \dfrac{\begin{array}{c} f = \big(H \Leftarrow \mathsf{k}\_w(X, t), B\_1, \ldots, B\_n\big) \in K \qquad \mathsf{k}\_w(X, t) = \mathsf{sel}(f) \\ g = \big(\mathsf{k}\_{w'}(R', t') \Leftarrow B\_{n+1}, \ldots, B\_m\big) \in \mathsf{solved}(K) \qquad \sigma = \mathsf{mgu}\big(\mathsf{k}\_w(X, t), \mathsf{k}\_{w'}(R', t')\big) \end{array}}{\big(H \Leftarrow B\_1, \ldots, B\_n, B\_{n+1}, \ldots, B\_m\big)\sigma}$$

*Example 10.* Applying resolution between f4 and f2 (see Example 9), we obtain:

$$\mathsf{k}\_{T\_0 \cdot \mathbf{y}}(\mathsf{adec}(\mathsf{w}\_3, Y\_2), \langle \gamma, \mathsf{sign}(\gamma, sk\_p) \rangle) \Leftarrow \mathsf{k}\_{T\_0 \cdot \mathbf{y}}(Y\_2, sk\_i).$$

Then, we will derive k_{T0·y}(adec(w3, w2), ⟨γ, sign(γ, skp)⟩) ⇐ and this solved statement (with others) will be used to perform resolution on f1, leading (after several resolution steps) to the statement:

$$\mathsf{r}\_{T\_0 \cdot w\_0 \cdot (v, \mathsf{test})} \Leftarrow \mathsf{k}\_{T\_0}(X\_1', x'),\ \mathsf{k}\_{T\_0}(X\_2', \mathsf{sign}(x', sk\_p)),\ \mathsf{k}\_{T\_0 \cdot w\_0^5}(X\_3', x')$$

Ultimately, we will derive r_{T0·w0σ′·(v,test)} ⇐ with σ′ = {x′ → γ}.

During saturation, the statement obtained by resolution is given to an update function which decides whether or not it has to be added into the knowledge base (possibly after some transformations). In the original Akiss, many deduction statements are discarded during the saturation procedure. This is useful to avoid non-termination issues, and it is not a problem since there is no need to derive the same term (from the deduction point of view) in more than one way. Now, considering that messages need time to reach a destination, the same message emitted twice at two different locations deserves more attention.

*Example 11.* Let T = (a1, out(k)).(a2, out(k)).(b, in^z(x)).(b, [x = k]).(b, [z < 2]), and T0 be a topology such that DistT0(a1, b) = 10 while DistT0(a2, b) = 1. The configuration (T; ∅; 0) is executable but only considering w2 as an input recipe for x. The recipe w1 that produces the exact same term k is not an option (even if it is outputted before w2) since the agent a1 who outputs it is far away from b.

Whereas the original Akiss procedure will typically discard the statement k(w2, k) ⇐ (by replacing it with an identical statement), we will keep it.

As illustrated by Example 11, we therefore need to consider more recipes (even if they deduce the same message) to accommodate timing constraints, but we have to do this in a way that does not break termination (in practice). To tackle this issue, we modified the canonicalization rule, as well as the update function to allow more deduction statements to be added in the knowledge base.

**Definition 7.** *The canonical form* f⇓ *of a statement* f = (H ⇐ B1,...,Bn) *is the statement obtained by applying the* Remove *rule given below as many times as possible.*

$$\text{Remove} \ \dfrac{H \Leftarrow \mathsf{k}\_{w}(X, t),\ \mathsf{k}\_{w}(Y, t),\ B\_{1}, \ldots, B\_{n} \qquad \text{with } X \notin \mathit{vars}(H)}{H \Leftarrow \mathsf{k}\_{w}(Y, t),\ B\_{1}, \ldots, B\_{n}}$$

The intuition is that there is no need to consider several recipes (here X and Y ) to deduce the same term t when such a recipe does not occur in the head of the statement.
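Applying Remove until a fixed point is reached is a small loop. A sketch over body atoms k_w(X, t) encoded as (X, w, t) triples, with head_vars the recipe variables occurring in H (the encoding is this illustration's choice):

```python
def canonicalize(head_vars, body):
    """Apply the Remove rule exhaustively: drop an atom k_w(X, t) whenever
    X does not occur in the head and another body atom deduces the same
    term t in the same world w."""
    body = list(body)
    changed = True
    while changed:
        changed = False
        for i, (X, w, t) in enumerate(body):
            if X in head_vars:
                continue  # Remove requires X not in vars(H)
            if any(j != i and w2 == w and t2 == t
                   for j, (_, w2, t2) in enumerate(body)):
                del body[i]
                changed = True
                break
    return body
```

With head variables {Y}, the body k_w(X, t), k_w(Y, t) canonicalizes to k_w(Y, t) alone, while a body whose recipe variables all occur in the head is left untouched.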

Then, the update of K by f, denoted K ⊛ {f}, is defined to be K if either skl(f⇓) is not in normal form, or f⇓ is solved but not well-formed. Otherwise, K ⊛ {f} = K ∪ {f⇓}. To initiate our saturation procedure, we start with the initial knowledge base Kinit(S) associated to a set S of statements (typically seed(T, C) for some well-chosen C). Given a set S of statements, the initial knowledge base associated to S, denoted Kinit(S), is defined to be the empty knowledge base updated by the set S, i.e. Kinit(S) = ((∅ ⊛ {f1}) ⊛ {f2}) ⊛ . . . ⊛ {fn} where f1, . . . , fn is an enumeration of the statements in S. In return, the saturation procedure produces a set sat(K) which is actually a knowledge base.

Then, we can establish the soundness of our saturation procedure. This is relatively straightforward and follows the same lines as the original proof.

**Proposition 1.** *Let* T<sup>0</sup> *be a trace locally closed w.r.t.* X *,* K = sat(Kinit(T0))*. We have that* (T0; ∅) |= g *for any* g ∈ solved(K) ∪ H(solved(K))*.*

#### **5.2 Completeness**

Completeness is more involved. Indeed, we cannot expect to retrieve all the recipes associated to a given term. To ensure termination (in practice) of our procedure, we discard some statements when updating the knowledge base, and we have to justify that those statements are indeed useless. Actually, we show that considering uniform recipes is sufficient when looking for an attack trace.

However, the notion of uniform recipe does not allow one to do the proof by induction. We therefore consider a more restricted notion that we call *asap recipes*. The idea is to deduce a term as soon as possible, but this may depend on the agent who is performing the computation. We also rely on an ordering relation which is independent of the agent who is performing the computation, and which is compatible with our notion of asap w.r.t. any agent.

Given a relaxed execution exec = (T; ∅) −ℓ1,...,ℓn→ (S; φ) with input recipes R1, . . . , Rk, we define the following relations:


Then, <exec is the smallest transitive relation over recipes built on *dom*(φ) that contains <^in_exec and <^sub_exec. As usual, we denote ≤exec the reflexive closure of <exec.

Given a timed execution exec = (T0; ∅; t0) −ℓ1,...,ℓn→ (S; Φ; t) with Φ = {w1 −a1,t1→ u1, . . . , wn −an,tn→ un}, we denote by agent(wi) (resp. time(wi)) the agent ai (resp. the time ti). The relation <^a_exec over *dom*(Φ) × *dom*(Φ) with a ∈ A is defined as follows: w <^a_exec w′ when:


This order is extended to recipes as follows: R <^a_exec R′ when:


We have that <^a_exec is a well-founded order for any a ∈ A which is compatible with <exec, i.e. R <exec R′ implies R <^a_exec R′ for any agent a.

We are now able to introduce our notion of asap recipe.

**Definition 8.** *Let* T = (A, M, Loc) *be a topology, and* exec = (T0; ∅; t0) −ℓ1,...,ℓn→ (S; Φ; t) *be an execution. A recipe* R *is* asap w.r.t. a ∈ A *and* exec *if:*

*– either* R ∈ Npub ∪ R+ ∪ W *and there is no* R′ *such that* R′ <exec R *and* R′Φ↓ = RΦ↓*; – or* R = f(R1, . . . , Rk) *with* f ∈ Σ *and there is no* R′ *such that* R′ <^a_exec R *and* R′Φ↓ = RΦ↓*.*

We may note that our definition of asap takes care of honest agents, who are not allowed to forge messages from their knowledge using recipes not in W ∪ Npub ∪ R+. Hence, a recipe R ∈ W is not necessarily replaced by a recipe R′ even if R′ <^a_exec R and R′Φ↓ = RΦ↓. Actually, such a recipe R′ is not necessarily an alternative to R when a ∉ M0.

Then, we can establish completeness of our saturation procedure w.r.t. these asap recipes.

**Theorem 2.** *Let* K = solved(sat(Kinit(T0)))*. Let* exec = (T0; ∅; t0) −−ℓ1,…,ℓp−−→ (S; Φ; t) *be an execution with input recipes* R1, …, Rk *forged by* b1, …, bk *and such that each* Rj *with* j ∈ {1, …, k} *is asap w.r.t.* bj *and* exec*. We have that:*


*Proof (sketch).* We have that asap recipes are uniform and we can therefore apply Theorem 1. This allows us to obtain a proof tree in H(seed(T0)). Then, by induction on the proof tree, we lift it from H(seed(T0)) to H(K). The difficult part is when the statement obtained by resolution is not directly added to the knowledge base. It may have been modified by the rule Remove or even discarded by the update operator. In both cases, we derive a contradiction with the fact that we are considering asap recipes.

*Example 12.* Considering the relaxed execution starting from (T0 · Tex, ∅) by performing the three outputs followed by the untimed version of the execution described in Example 6, we reach (∅; φ) using recipes R1 = aenc(adec(w3, w2), w1) and R2 = h(w5, w4, proj1(adec(w3, w2))). Let K be the set of solved statements obtained by saturation; we have that r_{T0 · w0σ · (v,test)} ∈ H(K) (see Example 10). Note that the symbolic run T0 · w0σ · (v, test) coincides with the labels used in the execution trace. Here, the proof tree is reduced to a leaf, and choosing R̂1 = R1, R̂2 = R2 gives us the matching we are looking for.

### **6 Algorithm**

In this section, we first present our algorithm to verify whether a given timed configuration can be fully executed, and then discuss its correctness.

#### **6.1 Description**

Our procedure is given in Algorithm 1. We start with the set K of solved statements obtained by applying our saturation procedure on the trace T. We consider each reachability statement in K and, after instantiating the remaining variables with fresh constants using a bijection ρ, we compute for each input (ai, in(vi)) occurring in ℓ1, …, ℓn all the possible recipes that may lead to the term viρ and store them in the set Li. Actually, thanks to our soundness result (Proposition 1), we know that these recipes deduce the requested terms, and it only remains to check that the timing constraints are satisfied (lines 10–11).

We consider a trace T of the form (a1, α1).(a2, α2). ⋯ .(an, αn) locally closed w.r.t. X and Z, and we assume the naming convention given in Sect. 4.2. Moreover, we denote by orig(j) the index of the action in the trace T that performed the jth output, i.e. orig(j) is the minimal k such that |Snd(k)| = j. The function Timing takes as inputs the initial configuration, the recipes used to feed the inputs occurring in the trace, and the terms corresponding to these inputs. Note that all these terms may still contain variables from Z. This function computes a formula that represents all the timing constraints that have to be satisfied to ensure the executability of the trace in our timed model. More formally, Timing((T; ∅; t0), R_{i1}, …, R_{ip}, u_{i1}, …, u_{ip}) is the conjunction of the following formulas:

1. z1 = t0, and zi ≤ zi+1 for any 1 ≤ i < n;
2. ti ∼ t′i for any i ∈ Test(n) with αi = [[ti ∼ t′i]];
3. zi = vi{xj → uj | j ∈ Rcv(i)}↓ for any i ∈ Let(n);
4. for any i ∈ Rcv(n), we consider the formula:


$$\bigvee\_{b \in \mathcal{M}\_0} \left( \bigwedge\_{\{j \mid \boldsymbol{w}\_j \in vars(R\_i)\}} z\_{\text{orig}(j)} + \textsf{Dist}\_{\mathcal{T}\_0} (a\_{\text{orig}(j)}, b) \le z\_i - \textsf{Dist}\_{\mathcal{T}\_0} (b, a\_i) \right)$$

The last step of our algorithm consists in checking whether the resulting formula ψ is satisfiable, i.e. whether there exists a mapping from *vars*(ψ) to R+ such that the formula ψ is true. Of course, even if our procedure is generic w.r.t. timing constraints, the procedure to check the satisfiability of ψ will depend on the constraints we consider. Actually, all the formulas encountered during our case studies are quite simple: they are expressed by inequalities of the form z − z′ ≤ t, and we therefore rely on the well-known Floyd–Warshall algorithm to solve them. When needed, we may rely on the simplex algorithm to solve more general linear constraints.
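For the special case of difference constraints, the satisfiability check can be sketched as a negative-cycle search on a constraint graph; the snippet below is an illustrative Python sketch (function names are ours, not the tool's), where each constraint z_i − z_j ≤ t becomes an edge j → i of weight t and Floyd–Warshall detects inconsistency:

```python
def satisfiable(n, constraints):
    """Check satisfiability of difference constraints z_i - z_j <= t
    over variables z_0 .. z_{n-1} (a negative-cycle test)."""
    INF = float("inf")
    # dist[j][i] = tightest bound on z_i - z_j seen so far
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for (i, j, t) in constraints:          # constraint: z_i - z_j <= t
        dist[j][i] = min(dist[j][i], t)    # edge j -> i of weight t
    # Floyd-Warshall: combine constraints transitively
    for k in range(n):
        for a in range(n):
            for b in range(n):
                if dist[a][k] + dist[k][b] < dist[a][b]:
                    dist[a][b] = dist[a][k] + dist[k][b]
    # A negative self-loop means some chain of constraints sums to
    # z_i - z_i <= c with c < 0, which is unsatisfiable
    return all(dist[i][i] >= 0 for i in range(n))
```

A negative cycle corresponds to a chain of constraints whose sum yields 0 ≤ c with c < 0, i.e. an unsatisfiable system.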

#### **6.2 Termination Issues**

First, we may note that to obtain an effective saturation procedure, it is important to start with a finite set of seed statements. Our set seed(T) is infinite, but as was proved in [12], we can restrict ourselves to performing saturation using the finite set seed(T, C_T) where C_T contains the public names and the real numbers occurring in the trace T. More formally, we have that:

**Lemma 1.** *Let* C<sup>T</sup> *be the finite set of public names and real numbers occurring in* <sup>T</sup>*, and* <sup>C</sup>all <sup>=</sup> <sup>N</sup>pub <sup>∪</sup> <sup>R</sup><sup>+</sup>*. We have that:*

$$\mathsf{sat}(K\_{\text{init}}(\mathsf{seed}(T,\mathcal{C}\_{all}))) = \mathsf{sat}(K\_{\text{init}}(\mathsf{seed}(T,\mathcal{C}\_T))) \cup \\{\mathsf{k}(c,c) \Leftarrow \ \mid \ c \in \mathcal{C}\_{all}\\}.$$

Nevertheless, the saturation may not terminate. We could probably avoid some non-termination issues by improving our update operator. However, ensuring termination in theory is a rather difficult problem (the proof of termination for the original Akiss procedure for subterm convergent theories is quite complex [12] – more than 20 pages). We would like to mention that we never encountered non-termination issues in practice on our case studies.

Another issue is that, when computing the set Li, we need to compute all the recipes R such that k_w(R, u) ∈ H(K) for a given term u. This can be achieved using a simple backward search and will terminate since K only contains solved statements that are well-formed. The naive recursive algorithm will therefore consider terms u1, …, un that are strict subterms of the initial term u. Note that statements that are not well-formed are discarded by our update operator: ensuring completeness of our saturation procedure when discarding such statements is the challenging part of our completeness proof.

#### **6.3 Correctness of Our Algorithm**

We consider a topology T<sup>0</sup> and a configuration (T; ∅;t0) built on top of T<sup>0</sup> and such that T is locally closed w.r.t. both X and Z.

**Theorem 3.** *Let* C_T ⊆ Npub ∪ R+ *be the finite set of public names and real numbers occurring in* T*. Let* K = solved(sat(Kinit(seed(T, C_T))))*. We have that:*

*– if* Reachability(K, t0, T0) *holds, then* (T; ∅; t0) *is executable in* T0*;*

*– if* (T; <sup>∅</sup>;t0) *is executable in* <sup>T</sup><sup>0</sup> *then* Reachability(K, t0, <sup>T</sup>0) *holds.*

Soundness (item 1 above) is relatively straightforward. Item 2 is more involved. Of course, our algorithm does not consider all the possible recipes for inputs. Some recipes are discarded from our analysis. Actually, it is sufficient to focus our attention on asap recipes. To justify that this is not an issue regarding completeness, we first establish the following result.

**Lemma 2.** *Let* exec = K0 −−ℓ1,…,ℓn−−→T0 (S; Φ; t) *be an execution. We may assume w.l.o.g. that* exec *involves input recipes* R1, …, Rk *forged by agents* b1, …, bk *and that* Ri *is asap w.r.t.* bi *and* exec *for each* i ∈ {1, …, k}*.*

Then, we may apply Theorem 2 (item 1) on this "asap execution" and deduce the existence of f = r_{ℓ′1,…,ℓ′n} ⇐ k_{w1}(X1, x1), …, k_{wm}(Xm, xm) in K and a substitution σ witnessing the fact that r_{ℓ1,…,ℓn} = r_{ℓ′1σ,…,ℓ′nσ} ∈ H(K). Moreover, we know that f and σ match with exec and R1, …, Rk. Considering the symbolic recipes R̂1, …, R̂k witnessing this matching, and instantiating their variables with adequate fresh constants (using ρ), we can show that R̂1ρ, …, R̂kρ are recipes that allow one to perform the timed execution ℓ1ρ, …, ℓnρ. Note that thanks to the strong relationship between R1, …, Rk and R̂1, …, R̂k (by definition of matching, Ri = R̂iσ), we know that the resulting timing constraints gathered in the formula ψ due to inputs are less restrictive, and the other ones are essentially unchanged. This allows us to ensure that the formula ψ will be satisfiable. Now, applying Lemma 2, we can assume w.l.o.g. that the recipes involved in such a trace are asap, and thus, according to Theorem 2, they will be considered by our procedure and put in L_{i1}, …, L_{ip} at line 6 of Algorithm 1.

### **7 Implementation and Case Studies**

We validated our approach by integrating our procedure into Akiss [12] and successfully used it on several case studies. All files related to the tool implementation and case studies are available at

http://people.irisa.fr/Alexandre.Debant/akiss-db.html.

#### **7.1 Integration in Akiss**

Our syntax is very close to the one presented in Sect. 3. For the sake of simplicity, we sometimes omit timestamps on input/output actions. Regarding timing constraints, our syntax only allows linear expressions of the form z1 − z2 ∼ z3 with zi ∈ Z ∪ R+ and ∼ ∈ {=, <, ≤}. These expressions are enough to model all our case studies. To ease the specification of protocols, our tool supports parallel composition of traces (T1 || T2). This operator is syntactic sugar and can be translated to sets of traces in a straightforward way.

To mitigate the potential exponential blowup caused by this translation, we always favour let, equality, and test actions, as well as output actions when no timestamp occurs on them. The second optimisation consists in executing consecutive input actions (without timestamps) in a row. These optimisations reduce the number of traces that have to be considered during our analysis, and are well known to be sound when verifying reachability properties [4,30].

*Example 13.* Let P = (a, in(x1)).(a, in(x2)).(a, out(u)) || (b, in(x3)).(b, out(v)). Computing naively all the possible interleavings gives us 10 traces to analyse. The first optimisation allows us to reduce this number to 3, and together with the second optimisation, this number falls to 2.
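The count in this example can be reproduced mechanically: two parallel traces of lengths m and n admit C(m+n, m) interleavings. A small illustrative sketch (the helper name is hypothetical, unrelated to the tool):

```python
from itertools import combinations

def interleavings(t1, t2):
    """All order-preserving interleavings of two traces."""
    total = len(t1) + len(t2)
    result = []
    for slots in combinations(range(total), len(t1)):
        merged, i, j = [], 0, 0
        for k in range(total):
            if k in slots:          # this position is taken by t1
                merged.append(t1[i]); i += 1
            else:                   # otherwise the next action of t2
                merged.append(t2[j]); j += 1
        result.append(tuple(merged))
    return result

# Trace P of Example 13: three actions of agent a, two of agent b
t_a = [("a", "in(x1)"), ("a", "in(x2)"), ("a", "out(u)")]
t_b = [("b", "in(x3)"), ("b", "out(v)")]
assert len(interleavings(t_a, t_b)) == 10   # C(5, 2) = 10
```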

#### **7.2 Case Studies**

In this section we demonstrate that our tool can be effectively used to analyse distance bounding protocols and payment protocols. Our experiments were done on a standard laptop, and the results obtained confirm termination of the saturation procedure when analysing various protocols (× stands for attack, ✓ means that the protocol has been proved secure). We indicate the number of roles (running in parallel) we consider and the number of traces (due to all the possible interleavings of the roles) that have been analysed by the tool in order to conclude. Our algorithm stops as soon as an attack is found, and thus the number of possible interleavings is not relevant in this case.

We only consider two distinct topologies: one to analyse mafia fraud scenarios (2 honest agents far away from each other, with a malicious agent close to each honest agent) and one to analyse distance hijacking, for which 3 agents are considered (no malicious agent is allowed in the neighbourhood of the verifier on which the security property is encoded). This may seem restrictive, but it has been shown to be sufficient to capture all the possible attacks [19]. Our results are consistent with the ones obtained in [13,14,19,26].

**Distance Bounding Protocols.** As explained in Sect. 2 on the TREAD protocol, we ignore several details that are irrelevant to a security analysis performed in the symbolic model. Moreover, our procedure is not yet able to support the exclusive-or operator, and thus this operator has been modelled in an abstract way when analysing the BC and Swiss-Knife protocols. When no attack was found for 2 roles, we considered more roles (and thus more traces). The fact that the performance degrades when considering additional roles is not surprising and is clearly correlated with the number of traces that have to be considered.

**Payment Protocols.** We have also analysed three payment protocols (and some of their variants) w.r.t. mafia fraud – the only relevant scenario for this kind of application (see [13]). It happens that these protocols are more complex to analyse than traditional distance bounding protocols. They often involve more complex messages and a larger number of message exchanges. Moreover, in the MasterCard RRP and NXP protocols, the threshold is not fixed in advance but received during the protocol execution. Due to this, these protocols fall outside the class of protocols that can be analysed by [19,26]. To our knowledge, only [13] copes with this issue, by proposing a security analysis in two steps: they first establish that the value of the threshold cannot be manipulated by the attacker, and then analyse the protocol considering a fixed threshold. Such a protocol can be encoded in a natural way in our calculus using the let instruction [z := v], which allows one to extract timing information from a message. We analysed these protocols considering one instance of each role.


### **8 Conclusion**

We presented a novel procedure for reasoning about distance bounding protocols which has been integrated in the Akiss tool. Even though termination is not guaranteed, the tool did terminate on all practical examples that we have tested.


Directions for future work include improving the performance of our tool; this can be achieved by parallelising our algorithm (each trace can be analysed independently) and/or by proposing new techniques to reduce the number of interleavings. Another interesting direction would be to add the exclusive-or operator, which is often used in distance bounding protocols. This will require a careful analysis of the completeness proof developed in [3] to check whether their resolution strategy is compatible with the changes made here to accommodate timing constraints.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **On the Formalisation of** *Σ***-Protocols and Commitment Schemes**

David Butler1,2(B), David Aspinall1,2, and Adrià Gascón1,3

<sup>1</sup> The Alan Turing Institute, London, UK
dbutler@turing.ac.uk
<sup>2</sup> University of Edinburgh, Edinburgh, UK
<sup>3</sup> University of Warwick, Coventry, UK

**Abstract.** There is a fundamental relationship between Σ-protocols and commitment schemes whereby the former can be used to construct the latter. In this work we provide the first formal analysis in a proof assistant of such a relationship; in doing so we formalise Σ-protocols and commitment schemes and provide proofs of security for well-known instantiations of both primitives.

Every definition and every theorem presented in this paper has been checked mechanically by the Isabelle/HOL proof assistant.

**Keywords:** Commitment schemes · <sup>Σ</sup>-protocols · Formal verification · Isabelle/HOL

### **1 Introduction**

In [8], Damgård elegantly showed how Σ-protocols can be used to construct commitment schemes that are perfectly hiding and computationally binding, and thus showed how these two fundamental cryptographic primitives are linked. The properties of the resulting commitment scheme rely on the security of the underlying Σ-protocol. The relationship between the two is natural, as Σ-protocols can be considered the building block for zero knowledge, and it is well known that commitment schemes and zero knowledge protocols are strongly related [9].

When properties of fundamental primitives are linked in such a way, it is interesting to study them formally using a proof assistant to understand the nature of the relationship more deeply. In fact, the proof given in [8] of the security of the construction of commitment schemes from Σ-protocols is brief; thus, to study it formally, one has to consider the properties in more detail.

To achieve such a goal one must first formalise both primitives and then show the desired relations between them with respect to the individual formalisations for either primitive. To formalise and instantiate a primitive using a proof assistant one must first formalise the security definitions, then define the protocol

This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1.

c The Author(s) 2019

F. Nielson and D. Sands (Eds.): POST 2019, LNCS 11426, pp. 175–196, 2019. https://doi.org/10.1007/978-3-030-17138-4\_8

that realises the primitive and then provide proofs in the proof assistant that show the defined security properties are met by the protocol.

As well as providing deeper insight and more rigorous proofs for properties in cryptography, formalisations also provide increased confidence in cryptographic proofs. This increased level of rigour was called for by Halevi in 2005 [12], who proposed to approach the problem formally. One aspect of this approach is that security definitions are formally defined in an abstract way and then instantiated for different protocols that hold those security properties. The advantage of the abstract definitions is that a human checker only needs to check these definitions for consistency with the literature to have confidence in the whole collection of proofs. This is exactly the goal of this work.

In this paper, motivated by understanding the connection between Σ-protocols and commitment schemes, we use the proof assistant Isabelle/HOL to formally reason about these two fundamental primitives and then show how a Σ-protocol can be used to construct a commitment scheme. Specifically, we formally prove, with respect to our abstract definitions of commitment schemes, how the Schnorr Σ-protocol can be used to construct a perfectly hiding and computationally binding commitment scheme. In the process of achieving this we prove various Σ-protocols and commitment schemes secure in their own right.

To the best of our knowledge this is the first time the connection between Σ-protocols and commitment schemes has been considered using a proof assistant. Σ-protocols were considered in [5] and [2] and the Pedersen commitment scheme has been considered formally using EasyCrypt in [17]. We leave a discussion of the comparison of Isabelle/CryptHOL and EasyCrypt to Sect. 10.

Our formal proofs can be found at [1].

#### **Contributions**


proofs, can now be used by others completing cryptographic proofs inside Isabelle.

– All our protocols are shown secure in both the concrete and asymptotic cases.

**Outline.** In Sect. 2 we outline the structure of our formalisation and in Sect. 3 introduce the relevant theory of Isabelle/HOL and the main parts of CryptHOL [15]. In Sects. 4 and 6 we introduce our formalisation of Σ-protocols and commitment schemes respectively. Sections 5 and 7 show how we instantiate these abstract frameworks for the Schnorr and Pedersen protocols. We show how we link the two in Sect. 8, where we construct a commitment scheme using the Schnorr Σ-protocol. Finally we conclude, discuss related work, and provide a comparison with EasyCrypt in Sects. 9 and 10.

### **2 Formalisation Overview**

In this section we first outline the structure of our formalisation and then discuss the process of instantiating the abstract frameworks to achieve formal proof in Isabelle. Then we discuss the proof method for the asymptotic security setting.

#### **2.1 Outline of Formalisation**

We begin our formalisation by abstractly defining the security properties required for both commitment schemes and Σ-protocols. This part of the formalisation is defined over abstract types, giving the flexibility for it to be instantiated for any protocol; this allows us to have confidence in the proof's integrity when considering a range of different protocols. The abstract nature of the formalisation will also allow others to use the definitions of security and structure we provide to prove the security of other commitment schemes and Σ-protocols. Another benefit is that we can prove some technical lemmas at the abstract level and have them at our disposal in any instantiation, thus reducing the workload for future proofs. A final advantage of the abstract definitions is that a human checker only needs to verify these definitions for consistency with the literature to have confidence in the whole collection of proofs.

We instantiate the abstract frameworks to prove security of the Pedersen commitment scheme, the Schnorr Σ-protocol and a second Σ-protocol that uses a relation for the equality of discrete logarithms. Finally we use the algorithms that define the Schnorr protocol to construct a commitment scheme (as shown in [8]) and prove it secure with respect to our commitment scheme definitions using the properties obtained from the Σ-protocol proofs.

The workflow of this paper can be seen in Fig. 1, where an arrow implies the use of one theory (a formalised file in Isabelle) to achieve the next. For example, we prove the 'Schnorr commitment' secure with respect to the 'Abstract Commitments' definitions and using the algorithms and properties of the 'Schnorr' protocol.

**Fig. 1.** The structure of the formalisation in Isabelle

#### **2.2 Instantiating the Abstract Frameworks**

At a technical level Isabelle's module system (called locales) allows the user to prove theorems abstractly, relative to given assumptions. These theorems can be reused in situations where the assumptions themselves are theorems. In our case locales allow us to define properties of security relative to fixed constants and then instantiate these definitions for explicit protocols and prove the security properties as theorems.

The overall process of instantiation can be seen as a step-by-step process given below:


### **2.3 Asymptotic Security**

In our formalisation we first consider security in the concrete setting. Here we assume the security parameter n is implicit in all algorithms that parametrise the framework. We then move on to prove security in the asymptotic setting, utilising Isabelle's module system. More details about this part of our formalisation are given in Sect. 8.1. We note that the asymptotic setting is not considered in EasyCrypt proofs; the machinery available in Isabelle, however, makes it possible.

### **3 CryptHOL and Isabelle Background**

In this section we briefly introduce the Isabelle notation we use throughout, and then highlight and discuss some important aspects of CryptHOL. For more detail on CryptHOL see [6]. The full formalisation is available at [14].

#### **3.1 Isabelle Notation**

The notation we use in this paper closely resembles the syntax of Isabelle/HOL (Isabelle). For function application we write f(x, y) in an uncurried form for ease of reading, instead of f x y as in the λ-calculus. To indicate that term t has type τ we write t :: τ. Isabelle uses the symbol ⇒ for the function type, so a ⇒ b is the type of functions that take an input of type a and output an element of type b. The type 'a denotes an abstract type. The implication arrow −→ is used to separate assumptions from conclusions inside a closed HOL statement. We use ⊗ to denote multiplication in a group and ∗ for multiplication in a field.

#### **3.2 CryptHOL**

CryptHOL [6] is a framework for reasoning about cryptography in the computational model that is embedded inside the Isabelle/HOL theorem prover. It allows the prover to write probabilistic programs and reason about them. The computational model is based on probability theory and in particular uses probabilistic programs to define security—this can be seen for the construction of games in the game-based setting or the real and ideal views in the simulation-based setting.

To build the probabilistic programming framework, CryptHOL uses the existing probability theory formalised inside Isabelle to define discrete probability distributions called sub-probability mass functions (of type *spmf*). These can be thought of as probability mass functions with the property that they do not have to sum to one—we can lose some probability mass. This allows us to model failure events and assertions.

**Writing Probabilistic Programs.** CryptHOL provides easy-to-read, Haskell-style do notation for writing probabilistic programs, where **do**{x ← p; f(x)} is the probabilistic program that samples from the distribution p and returns the *spmf* produced by f. We can also return an *spmf* using the monad operation *return*. See Fig. 2 for an example.
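As an illustration of this monadic reading (a toy model only, not CryptHOL's actual definitions), a discrete spmf can be represented as a map from outcomes to probability mass, with *return* and bind as follows; the names `ret` and `bind` are ours:

```python
def ret(x):
    """return: the point distribution that yields x with probability 1."""
    return {x: 1.0}

def bind(p, f):
    """do { x <- p; f(x) }: sample x from p, then run the program f(x)."""
    out = {}
    for x, px in p.items():
        for y, py in f(x).items():
            out[y] = out.get(y, 0.0) + px * py
    return out

coin = {True: 0.5, False: 0.5}   # uniform over {True, False}

# do { b <- coin; b' <- coin; return (b = b') }
agree = bind(coin, lambda b: bind(coin, lambda b2: ret(b == b2)))
```

Here `agree` assigns mass 0.5 to each of True and False, matching the pen-and-paper computation of two independent coin flips agreeing.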

Proofs of security are mainly completed by manipulating the appropriate probabilistic programs. While the proofs that each manipulation is valid are not always accessible to non-experts, the effect of each manipulation can be easily seen and recognised as they are explicitly written in the do notation.

**Failure Events and Assertions.** We often have to reason about failure events. For example we must ensure the adversary in the hiding game (Fig. 6) outputs two valid messages for the game to proceed. Such events are handled using assertion statements

$$assert(b) = \text{if } b \text{ then } return(()) \text{ else } \bot$$

and the *TRY p ELSE q* construct. For example, *TRY* **do** {p} *ELSE* q distributes the probability mass not assigned by p to the distribution q. Picking up on our example of the hiding game: if the adversary fails to output two valid messages, the assertion fails and the *ELSE* branch is invoked, resulting in the adversary not winning the hiding game.

**Sampling.** Sampling from sets is important in cryptography. CryptHOL gives an operation *uniform* which returns a uniform distribution over a finite set. We use two cases of this function extensively: by *sample uniform*(*q*), where q is a natural number, we denote uniform sampling from the set {..< q}, and by *coin* we denote uniform sampling from the set {True, False}—a coin flip.

Using sampling we are able to illustrate one difference in the thought process and rigour required in a formal proof compared to a pen-and-paper proof. One-time pads (OTPs) are used extensively in protocols. Their properties are often employed without thought or explanation in paper proofs, as they are considered to be a simple construct. However, there are some more subtle issues that sometimes need to be considered.

$$\text{map}((\lambda b.\ (x \* b) \bmod q),\ sample\\_uniform(q)) = sample\\_uniform(q) \tag{1}$$

Equation 1 shows the traditional OTP for multiplication in a field: a uniform sample b from a set of q elements, multiplied with an input x and taken modulo q, is the same as a uniform sample. However, this property is only valid if x and q are coprime. In the finite field, this follows from Fermat's Little Theorem; thus, formally, we have to work much harder to use such a lemma. In short, formalising a proof demonstrates many areas where a paper proof falls short in detail.
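The side condition can be checked concretely for small parameters: multiplication by x modulo q permutes {0, …, q−1} exactly when gcd(x, q) = 1, so the mapped distribution is uniform again only in that case. A quick illustrative check in Python:

```python
from math import gcd

def otp_image(x, q):
    """Sorted image of the uniform distribution on {0,..,q-1}
    under the map b -> (x * b) mod q."""
    return sorted((x * b) % q for b in range(q))

q = 10
assert gcd(3, q) == 1
assert otp_image(3, q) == list(range(q))   # coprime: a permutation, still uniform
assert gcd(4, q) != 1
assert otp_image(4, q) != list(range(q))   # not coprime: mass collapses
```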

**Probabilities.** We must also be able to reason about the probability of events occurring. So, P[Q = x] denotes the subprobability mass the spmf Q assigns to the event x. We also introduce the notation ▷▷, which denotes the binding of a sample without the need for the do notation. This can be seen in Theorem 3, where the bound variable e is sampled from *challenge* and given to S2.

**Negligible Functions.** To reason about security in the asymptotic case we must consider negligible functions. These were formalised as part of CryptHOL. A function f :: (nat ⇒ real) is said to be negligible if

$$(\forall c > 0 \, . \, f \in o(\lambda x.inverse(x^c)))$$

where o is the little o notation. We discuss the use of such functions in our proofs in Sect. 8.1.
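As an informal numeric illustration (not a substitute for the formal definition), an inverse exponential such as f(n) = 2^−n eventually drops below every inverse polynomial n^−c:

```python
def f(n):
    """An inverse exponential: f(n) = 2^(-n)."""
    return 2.0 ** (-n)

# For each fixed c > 0, f(n) < n^(-c) once n is large enough,
# since 2^(-n) < n^(-c) amounts to n > c * log2(n).
for c in range(1, 6):
    assert f(100) < 100.0 ** (-c)
```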

### **4 Formalising** *Σ***-Protocols**

In this section we show how we formally define Σ-protocols and their security properties—it is with respect to these definitions that we prove security of the Schnorr protocol and a variant of it for equality of discrete logs in Sect. 5. For more details on the Σ-protocols see [9].

#### **4.1 Definition of** *Σ***-protocols**

We first consider a binary relation R; for some (h, w) that satisfies R, we say w is a witness for h. For example, the discrete log relation is formalised as follows

$$R\_{DL}(h, w) = \left(h = g^w \land w < |G|\right) \tag{2}$$

where g is a generator of the cyclic group G.
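For concreteness, R_DL can be evaluated in a toy cyclic group (far too small for any security, purely illustrative):

```python
# Toy cyclic group (illustrative only): g = 3 generates Z_7^*, which
# has order 6 (powers of 3 mod 7: 3, 2, 6, 4, 5, 1).
p, g, order = 7, 3, 6

def R_DL(h, w):
    """Discrete-log relation: w witnesses h iff h = g^w and w < |G|."""
    return h == pow(g, w, p) and w < order

h = pow(g, 4, p)        # h = 3^4 mod 7 = 4
assert R_DL(h, 4)       # 4 is a witness for h
assert not R_DL(h, 5)   # a wrong exponent is rejected
```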

A Σ-protocol is a two party protocol run between a prover (P) and a verifier (V ). In the protocol h is a common input to both P and V and w a private input to P such that R(h, w) is true. We define the structure of a Σ-protocol as follows:

**Definition 1.** *A* Σ*-protocol has the following three part form:*


*A* conversation *can be seen as the tuple* (a, e, z)*.*

Formally we model this as four abstract probabilistic programs whose types are given below. The inputs to the relation R are h, of type '*pub_input*, and w, of type '*witness*.

*init* :: '*pub_input* ⇒ '*witness* ⇒ ('*rand* × '*msg*) *spmf* (3)

*challenge* :: '*challenge spmf* (4)

*response* :: '*rand* ⇒ '*witness* ⇒ '*challenge* ⇒ '*response spmf* (5)

*check* :: '*pub_input* ⇒ '*msg* ⇒ '*challenge* ⇒ '*response* ⇒ *bool spmf* (6)

The challenge sent by V is defined as a random sampling (see [9]); therefore it needs no inputs here.

It is interesting to note that, unlike many paper-based definitions, none of the algorithms in our formalisation need take random coins as input. This is because they are already probabilistic programs and thus not deterministic by definition.

The three properties that define a Σ-protocol are completeness, special soundness and honest verifier zero-knowledge (HVZK). Special soundness ensures the prover cannot prove a false statement and HVZK says the verifier learns nothing of the witness that it cannot learn from the output of the verification and the public input.

**Definition 2.** *Assume the protocol run between* P *and* V *has the above form then it is said to be a* Σ*-protocol if the following properties hold:*

*– Completeness: if* P *and* V *follow the protocol on public input* h *and private input* w *such that* R(h, w) *is satisfied, then* V *always accepts.*

*complete*(h, w) ≡ R(h, w) −→ P[*complete_game*(h, w) = True] = 1

*– Special soundness: there exists an adversary* A *such that, when given a pair of accepting conversations (on public input* h*)* (a, e, z) *and* (a, e′, z′) *where* e ≠ e′*, it can compute* w *such that* R(h, w) *is satisfied.*

*s soundness*(h, w) ≡ ∃A. R(h, w) −→ P[*s soundness game*(h, w, A) = *True*] = 1

*– Honest verifier Zero-Knowledge: There exists a simulator* S *that on input* h *and challenge* e *outputs an accepting conversation* (a, e, z) *with the same probability distribution as the real conversations* (real view) *between* P *and* V *on input* (h, w)*.*

$$HVZK(h, w) \equiv R(h, w) \longrightarrow (real\_view(h, w) = challenge \rhd (\lambda e.\ S(h, e)))$$

In the literature, the adversary in the special soundness definition and the simulator in the HVZK definition must run in polynomial time. There are challenges in formalising this notion; therefore we verify by inspection that the adversaries we construct for special soundness run in polynomial time, and do not provide a formalisation of this property.

We define the probabilistic program *complete game* to run the components of the protocol honestly. In particular, we define a probabilistic program that takes as input (h, w), runs the four probabilistic programs of the protocol as they would be run in the protocol, and finally outputs the output of *check*.

The probabilistic program *s soundness game* is slightly more subtle. The game takes as input (h, w, A) and must construct two accepting views to give to the adversary. The condition on these views is that the challenge in the second view must be different from that of the first. On paper this is easy to reason about, as it can be considered intuitive, but formally we must work harder. We define a new probabilistic program, *snd challenge*(*e*), that outputs a challenge different from the original. For example, for the Schnorr protocol the challenge is a uniform sample from the field; consequently the second challenge must be sampled uniformly from all elements of the field excluding the first challenge p,

$$snd\_challenge(q, p) = uniform(\{..< q\} - \{p\}) \tag{7}$$

We must then prove all properties we required of *challenge* with respect to the new definition.
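As a concrete illustration, a second-challenge sampler of this shape can be sketched in Python (the function name and field size are ours, purely for illustration, not part of the formalisation):

```python
import random

# Hypothetical sketch of snd_challenge: sample uniformly from the field
# {0, ..., q-1} with the first challenge e removed, i.e. uniform({..<q} - {e}).
def snd_challenge(q, e):
    candidates = [x for x in range(q) if x != e]
    return random.choice(candidates)

q, e = 11, 3
samples = {snd_challenge(q, e) for _ in range(1000)}
assert e not in samples          # the second challenge never repeats the first
assert samples <= set(range(q))  # and stays inside the field
```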

In the honest verifier zero-knowledge property, the real view is a probabilistic program that defines the transcript of a real execution of the protocol, that is (a, e, z). Intuitively, if one can simulate the real view then we know there is no leakage of data (in this case, the witness) during an execution of the protocol. We note that, unlike previous work on the simulation-based proof method in MPC [7], where the real view could only be defined in the instantiation due to differing protocol structures, here we can define it solely from the algorithms used in the Σ-protocol. Both the special soundness game and the definition of the real view can be seen in Fig. 2.

Having made the above definitions we can define Σ-protocols formally as follows.

#### **Definition 3**

```
Σ-protocol(h,w) = complete(h,w) ∧ s soundness(h,w) ∧ HVZK(h,w)
```
Referring back to the diagram in Fig. 1, we can see that we have completed the work for the 'Abstract Σ-protocol' box.

```
special_soundness_game(h, w, A) = do {
  (r, a) ← init(h, w);
  e ← challenge;
  z ← response(r, w, e);
  e' ← snd_challenge(e);
  z' ← response(r, w, e');
  w' ← A(h, (a, e, z), (a, e', z'));
  return(w = w')}

real_view(h, w) = do {
  (r, a) ← init(h, w);
  e ← challenge;
  z ← response(r, w, e);
  return(a, e, z)}
```

**Fig. 2.** Definitions of the special soundness game and the real view for Σ-protocols.

### **5 Formalising the Schnorr** *Σ***-Protocol**

In this section we describe the proof of security of the Schnorr Σ-protocol. We also formalise the proof of security for a second Σ-protocol that is based on the equality of discrete logs [9]. The relation for this protocol can be seen in Eq. 8, where g′ is a second generator of the cyclic group G.

$$R((h, h'), w) = \{h = g^w \land h' = g'^w \land w < |G|\}\tag{8}$$

In the interest of space we only detail the formalisation of the Schnorr protocol here. We provide more detail of the formalisation than in other parts of the paper, as well as higher-level commentary.

#### **5.1 The Schnorr** *Σ***-protocol**

The Schnorr protocol uses a cyclic group G with generator g and considers the discrete log relation R*DL*, which can be seen in Eq. 2. The protocol is given in Fig. 3. The notation $← denotes uniform sampling, while ← denotes assignment.

**Fig. 3.** The Schnorr Σ-protocol.
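The protocol and the four algorithms of the previous section can be sketched as executable Python over a toy group (the parameters p = 23, q = 11, g = 2 are ours and far too small for real use; g generates a subgroup of prime order q of the multiplicative group mod p):

```python
import random

# Toy Schnorr group (illustrative parameters, ours): g = 2 has order q = 11 mod 23.
p, q, g = 23, 11, 2

def init(h, w):
    r = random.randrange(q)            # prover's randomness
    return r, pow(g, r, p)             # (r, a) with initial message a = g^r

def challenge():
    return random.randrange(q)         # verifier's uniform challenge

def response(r, w, e):
    return (w * e + r) % q             # z = w*e + r in the field

def check(h, a, e, z):
    return pow(g, z, p) == a * pow(h, e, p) % p   # accept iff g^z = a ⊗ h^e

def complete_game(h, w):
    r, a = init(h, w)
    e = challenge()
    z = response(r, w, e)
    return check(h, a, e, z)

w = 7                                  # witness for the discrete-log relation
h = pow(g, w, p)                       # public input h = g^w
assert all(complete_game(h, w) for _ in range(100))   # completeness holds
```

The final assertion mirrors Theorem 1: every honest run is accepted, since g^z = g^(w·e+r) = a ⊗ h^e.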

We consider the three properties that define a Σ-protocol in turn.

Completeness follows directly by unfolding the definitions and proving the identity $g^r \otimes (g^w)^e = g^{r + w*e}$. This is trivial, but provides Isabelle with a hint as to how to rewrite the definitions to discharge the completeness proof.

**Theorem 1.** *Assume* R*DL*(h, w) *then*

P[*completeness game*(h, w) = *True*] = 1

Secondly we must prove special soundness. To prove this we must construct an adversary that can extract the witness from two correct executions of the protocol. The special soundness adversary is given in Fig. 4.

```
A_ss(h, c1, c2) = do {
  let (a, e, z) = c1;
  let (a', e', z') = c2;
  return(if e > e' then ((z - z') * fst(bezw((e - e'), |G|))) mod |G|
         else ((z' - z) * fst(bezw((e' - e), |G|))) mod |G|)}
```

**Fig. 4.** The adversary used to prove special soundness for the Schnorr Σ-protocol. Note the output is equivalent to (z − z′)/(e − e′). In the proof of Theorem 2 we have a = a′ for the messages given to the adversary.

We highlight an important contribution of our work here. The output of A*ss* appears complex but is in fact equivalent to A*ss* outputting (z − z′)/(e − e′) in the field. The output uses the Bezout function (*bezw*), which finds a pair of witnesses for Bezout's theorem, to realise the inverse of e − e′. This function is provided by the Isabelle number theory library.

The reason we could not define the adversary as outputting a simple division is worthy of some technical discussion. The inputs to the division are of type natural, and as we are working in a field any output from the division is required to be of the same type. However, the output type of a division on naturals in Isabelle is a real number, so we must work around this to output the correct value as a natural. The condition e > e′ ensures the denominator is never negative, as we are working with naturals. While this may look like an unnatural solution to the issue, it is an effective one, and the work we provide here can be reused by others when this problem arises again (it almost certainly will, as division in a field is not uncommon in cryptography!). To allow us to work with an adversary defined in this way we must prove lemmas of the form:

**Lemma 1.** *Assume* a, b, w, y < |G|*,* a ≠ b *and* w ≠ 0 *then*

$$\begin{aligned} w = \big(&\textit{if } a > b \textit{ then } ((w*a+y) - (w*b+y)) * fst(bezw((a-b), |G|)) \\ &\textit{else } ((w*b+y) - (w*a+y)) * fst(bezw((b-a), |G|))\big) \bmod |G| \end{aligned} \tag{9}$$

The proof of Lemma 1 is quite involved; however, the lemmas we require for other instantiations follow a similar proof method. We also prove a general lemma for computing divisions in a finite field; this is given in Lemma 2.

**Lemma 2.** *Assume* gcd(a, |G|) = 1 *then*

$$[a \ast fst(bezw(a, |G|)) = 1](mod |G|)$$
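To illustrate the role *bezw* plays, here is a Python sketch of a bezw-style function via the extended Euclidean algorithm (the name mirrors Isabelle's *bezw*, but this implementation and the toy field size are ours), together with the witness extraction (z − z′)/(e − e′) performed by the special soundness adversary:

```python
# Extended Euclid: returns (u, v) with u*a + v*n = gcd(a, n). When
# gcd(a, n) = 1, u is the inverse of a modulo n -- exactly Lemma 2.
def bezw(a, n):
    if n == 0:
        return 1, 0
    u, v = bezw(n, a % n)
    return v, u - (a // n) * v

q = 11                          # prime group order (illustrative)
a = 5
u, _ = bezw(a, q)
assert (a * u) % q == 1         # a * fst(bezw(a, q)) = 1 (mod q)

# The extractor computes w = (z - z') / (e - e') in the field from two
# accepting Schnorr conversations with the same initial message:
w, r = 7, 4
e1, e2 = 3, 9
z1, z2 = (w * e1 + r) % q, (w * e2 + r) % q
inv, _ = bezw((e2 - e1) % q, q)
assert ((z2 - z1) * inv) % q == w   # the witness is recovered
```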

To apply statements such as Lemma 1 we often have to employ congruence rules. These allow the simplifier to use the context, in particular facts about the bound variables of the probabilistic programs that are required as assumptions. Using these methods we can prove special soundness.

**Theorem 2.** *Assume* R(h, w) *then we have*

$$\mathcal{P}[special\\_soundness\\_game(h, w, \mathcal{A}\_{ss}) = True] = 1$$

Finally we must prove honest verifier zero-knowledge. This requires us to define the real view of the protocol and show that there exists a simulator that takes as input the public input and a challenge and outputs a view that is indistinguishable from (here, equal as a probability distribution to) the real view. This follows the technique of simulation-based proofs that was formally introduced in Isabelle in [7]. The probabilistic program defining the simulator, along with the unfolded definition of the real view, is given in Fig. 5.

To show HVZK we prove the two views are equal. That is,

**Theorem 3.** *Assume* R*DL*(h, w) *then we have*

$$real\_view(h, w) = (challenge \rhd (\lambda e.\ S(h, e)))$$

In the definitions given in Fig. 5 the number of random samples differs between the two views. We note that the extra sampling for the simulation comes from the challenge, which, by definition, is sampled before being given to the simulator. To prove honest verifier zero-knowledge we manipulate the real view into a form where we can use Eq. 10, which describes a one-time pad for addition in the field.

$$\operatorname{map}(\lambda b.\ (y+b)\bmod q,\ uniform(q)) = \operatorname{uniform}(q)\tag{10}$$
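Equation 10 can be checked by exact enumeration over a small field (an illustrative check, not part of the formalisation): adding a constant y modulo q permutes {0, ..., q−1}, so the image of a uniform distribution is again uniform.

```python
from collections import Counter

# Exact check of the additive one-time pad (Eq. 10) for a toy field size q:
# map(λb. (y + b) mod q, uniform(q)) = uniform(q).
q, y = 11, 7
shifted = Counter((y + b) % q for b in range(q))   # image of the uniform support
assert shifted == Counter(range(q))                # same multiset: still uniform
```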

```
real_view(h, w) = do {
  r ← uniform(|G|);
  let (r, a) = (r, g^r);
  c ← uniform(|G|);
  let z = (w * c + r) mod |G|;
  return(a, c, z)}

S(h, e) = do {
  z ← uniform(|G|);
  let a = g^z ⊗ (h^e)⁻¹;
  return(a, e, z)}
```

**Fig. 5.** The unfolded real view and simulator for the Schnorr protocol
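For the toy parameters used earlier (p = 23, q = 11, g = 2; our illustrative instantiation), the distributional equality of Theorem 3 can be checked exhaustively, since both views range over finitely many random choices:

```python
from collections import Counter

p, q, g = 23, 11, 2                    # toy Schnorr group (ours)
w = 7
h = pow(g, w, p)

# Real view: enumerate all (r, c) pairs for the exact transcript distribution.
real = Counter((pow(g, r, p), c, (w * c + r) % q)
               for r in range(q) for c in range(q))

# Simulator: for each (c, z), set a = g^z ⊗ (h^c)^{-1}; since h has order q,
# the inverse (h^c)^{-1} can be computed as h^{q-c} mod p.
sim = Counter((pow(g, z, p) * pow(h, (q - c) % q, p) % p, c, z)
              for c in range(q) for z in range(q))

assert real == sim                     # equal as probability distributions
```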

To carry out this manipulation we must prove some basic identities about groups that provide Isabelle with hints as to how to rewrite the probabilistic programs. After proving the three properties we can show that the Schnorr protocol satisfies the definition of a Σ-protocol.

**Theorem 4.** *For the Schnorr* Σ*-protocol we have*

*Σ-protocol*(h, w).

### **6 Formalising Commitment Schemes**

In this section we introduce our formalisation of commitment schemes and their security properties. Commitment schemes are a cryptographic primitive, run between a committer C and a verifier V , that allow the committer to commit to a chosen message, while keeping it private, and at a later time reveal the message that was committed to. For more details on commitment schemes we refer the reader to [20].

There are three phases to a commitment scheme:

*– Key generation: keys are generated and distributed to the committer and the verifier.*

*– Commit: the committer sends the verifier a commitment to a chosen message, keeping the opening value private.*

*– Verify: the committer reveals the message and opening value, and the verifier checks them against the commitment.*
We formally model the three phases by fixing the types of three probabilistic programs (*key gen*, *commit*, *verify*), seen in the locale given in Fig. 6.

The key generation algorithm outputs the keys available to the committer (ck) and the verifier (vk) respectively. If all keys are public then we have ck = vk. We also fix two abstract predicates that are needed in concrete instances later: *valid msg* checks whether a message is valid, and A *cond* gives the conditions we require of an adversary in the binding game. A paper proof can easily dismiss the adversary as failing if these conditions are not met,


**Fig. 6.** Abstract commitment scheme locale and definitions.

however, formally we must catch this in the semantics. In fact these predicates serve another purpose too: they allow us to use the properties they capture in our reasoning at a later point in a proof. For example, for m to be a valid message we may require m ∈ G; this fact is then known to Isabelle for later use (e.g. when applying Eq. 11).

#### **6.1 Properties of Commitment Schemes**

There are two main properties associated with commitment schemes: the hiding and binding properties. We also consider a third property, correctness; the need for this is explained at the end of the section.

**Hiding.** Intuitively, the hiding property says that no adversary can distinguish two committed messages. To define the hiding property we define the *hiding game* between an adversary A and a benign challenger; the formal game can be seen in Fig. 6. The game asks the adversary to output two messages, one of which is committed to by the challenger, with the corresponding commitment handed back to the adversary. The adversary is then asked to guess which message was committed to, and wins the game if it guesses correctly.

Using the hiding game we can define the hiding advantage.

**Definition 4.** *The hiding advantage is the probability an adversary has of winning the hiding game.*

*hiding advantage*(A) ≡ P[*hiding game*(A) = *True*]

Using this we can define perfect hiding, which holds for the Pedersen commitment scheme.

**Definition 5.** *For perfect hiding we require*

$$perfect\_hiding(\mathcal{A}) \equiv (hiding\_advantage(\mathcal{A}) = \tfrac{1}{2})$$
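To make the game concrete, here is a sketch of the hiding game played against the Pedersen scheme of Sect. 7 (the game driver, the toy parameters, and the guessing adversary are our illustrative reconstructions):

```python
import random

# Toy group (ours): g = 2 generates a subgroup of prime order q = 11 mod p = 23.
p, q, g = 23, 11, 2

def hiding_game(adversary_choose, adversary_guess):
    pk = pow(g, random.randrange(1, q), p)     # challenger's public key
    m0, m1 = adversary_choose(pk)              # adversary picks two messages
    b = random.randrange(2)                    # challenger's secret bit
    d = random.randrange(q)
    c = pow(g, d, p) * pow(pk, (m0, m1)[b], p) % p   # Pedersen commitment
    return adversary_guess(pk, c) == b         # adversary wins iff it guesses b

# Perfect hiding: the commitment is independent of b, so even the best
# adversary wins with probability exactly 1/2; here a coin-flip adversary.
wins = sum(hiding_game(lambda pk: (1, 2),
                       lambda pk, c: random.randrange(2))
           for _ in range(2000))
assert 700 < wins < 1300   # loose statistical bound around the mean of 1000
```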

**Binding.** The binding property ensures that the committer cannot change her mind about the message she has committed to. Again a security game is used (see Fig. 6). We challenge the adversary to bind two messages (m, m′) and two opening values (d, d′) to the same commitment c.

Similar to the hiding property we define the binding advantage:

**Definition 6.** *The binding advantage is the probability an adversary has of winning the binding game.*

*binding advantage*(A) ≡ P[*binding game*(A) = *True*]

To show computational binding we must show the binding advantage is a negligible function with respect to the security parameter. This result can only be shown in the asymptotic setting as it requires an explicit security parameter. In the concrete setting we can show a reduction to a known hard problem (for the Pedersen scheme this is the discrete logarithm problem). We can then extend to the asymptotic setting. See Sect. 8.1 for more details on our proofs in the asymptotic setting.

**Correctness.** There is one subtlety to the binding definition that means we must also consider correctness: if the verifier always outputs false, the binding property is trivially met, as the adversary can never win the game.

Correctness is the property that, assuming honest parties, a commitment will be verified as true by the verifier. Analogously to the hiding and binding properties we use the correctness game to define correctness.

#### **Definition 7**

*correct*(*m*) ≡ (P[*correct game*(*m*) = *True*] = *1* )

### **7 The Pedersen Commitment Scheme**

In this section we discuss our formalisation of the Pedersen commitment scheme. We do not discuss the proofs in detail, but instead provide the formal results and a discussion of the interesting aspects learned from the proof.

The protocol, given in Fig. 7, is run using a cyclic group G of prime order with generator g.

Intuitively, the hiding property holds because the message m is not sent explicitly, but is masked by the uniform sample g^d (in g^d ⊗ pk^m); consequently the verifier cannot distinguish between two committed messages. The binding property is more subtle: if the adversary can bind two messages to the same committed value, then the adversary can also compute the discrete log of pk, in violation of the discrete log assumption. Correctness is immediate; the committed value is c = g^d ⊗ pk^m, and the verifier accepts the message if the (m, d) sent by the committer is such that g^d ⊗ pk^m = c.


**Fig. 7.** The Pedersen commitment protocol.
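The three phases can be sketched as follows over a toy group (p = 23, q = 11, g = 2 are our illustrative parameters, far too small for real use):

```python
import random

# Toy Pedersen scheme (illustrative parameters, ours).
p, q, g = 23, 11, 2

def key_gen():
    x = random.randrange(1, q)
    return pow(g, x, p)                        # ck = vk = pk; x is discarded

def commit(pk, m):
    d = random.randrange(q)                    # opening value
    c = pow(g, d, p) * pow(pk, m, p) % p       # c = g^d ⊗ pk^m
    return c, d

def verify(pk, c, m, d):
    return c == pow(g, d, p) * pow(pk, m, p) % p

pk = key_gen()
m = 9
c, d = commit(pk, m)
assert verify(pk, c, m, d)                     # correctness (cf. Theorem 6)
```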

#### **7.1 Formal Proofs for the Pedersen Protocol**

We fix a finite cyclic group G, with generator g and order |G|, and explicitly define the probabilistic programs that define the protocol.

**Perfect Hiding.** Lemma 3 shows that we have perfect hiding for the Pedersen commitment scheme.

**Lemma 3.** *For the Pedersen commitment scheme we have*

$$\mathcal{P}[hiding\_game(\mathcal{A}) = True] = \frac{1}{2}$$

The security of the hiding property comes from applying the OTP lemma:

$$c \in carrier\ G \Longrightarrow map((\lambda x.\ g^x \otimes c),\ uniform(|G|)) = map((\lambda x.\ g^x),\ uniform(|G|)) \tag{11}$$

The work needed to apply the one-time pad lemma lies in showing that c ∈ carrier G. This requires some congruence lemmas, as membership of the carrier group arises from conditions on bound variables. Applying the one-time pad lemma shows that the value given to the adversary is independent of m*b*. Consequently the output of the adversary can be nothing more than a guess; in other words, the adversary may as well flip a coin to decide its output.

**Computational Binding and Correctness.** To prove the binding property we show a reduction to the discrete logarithm assumption. Hardness assumptions are a cornerstone of cryptography, so we take a moment to consider how one may be formalised. One can follow a similar pattern for defining other hardness assumptions; for example, see how the DDH assumption is defined in [16].

We first define the task of the adversary and then the advantage associated with the adversary. In the case of the discrete log assumption we simply provide the adversary with g^x, where x is uniformly sampled, and ask it to output x. We formalise this as a game between the adversary and a challenger in Fig. 8.

```
dis_log_game(A) = do {
  x ← sample_uniform(|G|);
  let h = g^x;
  x' ← A(h);
  return(x = x')}
```

**Fig. 8.** The discrete log game.

We then define the associated advantage of the adversary when playing this game—the probability of it winning the game.

**Definition 8.** *dis log advantage*(A) ≡ P[(*dis log game*(A)) = *True*]

To prove binding we construct an adversary, *dis log* A, from the adversary A that plays the binding game, and show that it has the same advantage against the discrete log game as A has against the binding game. Our adversary takes a similar form to that used in the proof of special soundness for the Schnorr protocol. Using this we can show Lemma 4, which easily gives Theorem 5.

### **Lemma 4**

$$bind\_game(\mathcal{A}) = dis\_log\_game(dis\_log\_\mathcal{A}(\mathcal{A}))$$

**Theorem 5**

*bind advantage*(A) = *dis log advantage*(*dis log* A(A))
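The extraction underlying this reduction can be sketched numerically (our reconstruction over a toy group): two accepting openings (m, d) and (m′, d′) of one commitment c satisfy d + x·m ≡ d′ + x·m′ (mod q), so x = (d − d′)/(m′ − m) in the field.

```python
# Toy group (ours): g = 2 has order q = 11 mod p = 23.
p, q, g = 23, 11, 2
x = 6                                   # secret discrete log of pk
pk = pow(g, x, p)

# Build a binding break as a test case (using x only to forge the data):
m, m2 = 3, 8                            # two messages bound to one commitment
d = 5
d2 = (d + x * (m - m2)) % q             # second opening value
c = pow(g, d, p) * pow(pk, m, p) % p
assert c == pow(g, d2, p) * pow(pk, m2, p) % p   # both openings accept

# The reduction's extraction step: divide in the field to recover x.
inv = pow(m2 - m, -1, q)                # modular inverse (Python 3.8+)
assert ((d - d2) * inv) % q == x        # discrete log recovered
```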

Finally we prove the correctness of the Pedersen scheme. This result comes easily after proving some group identities in Isabelle.

**Theorem 6.** *We have*

$$\mathcal{P}[(correct\\_game(m)) = True] = 1.$$

### **8 Using** *Σ***-Protocols to Construct Commitment Schemes**

In [8], it was shown how commitment schemes can be constructed from Σ-protocols. One can use the components of a Σ-protocol for a relation R to form a commitment scheme as follows:

*Key Generation.* The keys are generated such that the verifier receives (h, w) ∈ R that satisfy R and the committer receives only h.

*Commit.* The committer runs the simulator on their key h and the message m they wish to commit to. That is, they run

$$(a, e, z) \leftarrow S(h, m)$$

and send a to the verifier, keeping e and z as the opening values.

*Verify.* In the verification stage the committer sends e and z to the verifier, who uses the check algorithm of the Σ-protocol to confirm that (a, e, z) is an accepting conversation.

We recall that in the *Commit* stage we have e = m. The resulting commitment scheme can be shown to be perfectly hiding and computationally binding. Intuitively, perfect hiding comes from the fact that the simulation is perfect (the simulated and real views are equal) and that the initial message does not depend on the challenge. Binding holds because if a committer could output two sets of opening values, (e, z) and (e′, z′), for one commitment a such that e ≠ e′, then (a, e, z) and (a, e′, z′) would both be accepting conversations, and by the special soundness property we could compute w, contradicting the hardness assumption on R. We formally prove these results in Isabelle for the commitment scheme constructed from the Schnorr Σ-protocol.

To formalise this in Isabelle we again explicitly define the constants required for commitment schemes. We define these using the constants defined for the Schnorr Σ-protocol. This requires us to import both locales (for commitment schemes and Σ-protocols) and prove all assumptions relating to them. We are then able to prove perfect hiding, computational binding and correctness of the resulting commitment scheme. In the concrete setting we show a reduction of the binding property to the discrete logarithm assumption.
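The construction, instantiated with the Schnorr simulator over a toy group, can be sketched as follows (parameters and names are ours, for illustration only):

```python
import random

# Toy Schnorr group (ours): g = 2 has order q = 11 mod p = 23.
p, q, g = 23, 11, 2

def key_gen():
    w = random.randrange(1, q)
    return pow(g, w, p), w               # committer gets h; verifier gets (h, w)

def simulator(h, e):                     # Schnorr HVZK simulator
    z = random.randrange(q)
    a = pow(g, z, p) * pow(h, (q - e) % q, p) % p   # a = g^z ⊗ (h^e)^{-1}
    return a, e, z

def commit(h, m):
    a, e, z = simulator(h, m)            # commit by simulating with e = m
    return a, (e, z)                     # commitment a, opening values (e, z)

def check(h, a, e, z):                   # the Σ-protocol's check algorithm
    return pow(g, z, p) == a * pow(h, e, p) % p

h, w = key_gen()
m = 4
a, (e, z) = commit(h, m)
assert e == m and check(h, a, e, z)      # correctness of the derived scheme
```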

**Theorem 7.** *For the commitment scheme constructed from the Schnorr protocol we have*

$$\mathcal{P}[correct\\_game(m) = True] = 1$$

$$\mathcal{P}[hiding\\_game(\mathcal{A}) = True] = \frac{1}{2}$$

$$\mathcal{P}[bind\_game(\mathcal{A}) = True] = \mathcal{P}[dis\_log\_game(dis\_log\_\mathcal{A}(\mathcal{A})) = True]$$

This result has taken an instantiated Σ-protocol, used its components to instantiate a commitment scheme, and proven it secure with respect to the definitions we formalised for commitment schemes.

#### **8.1 Asymptotic Case**

So far in our formalisation the security parameter has been assumed to be implicit in all algorithms (probabilistic programs). In this section we show how we formalise proofs in the asymptotic setting using as an example the commitment scheme we have just constructed using the Schnorr Σ-protocol. In our formalisation we provide proofs in the asymptotic setting for all instantiations. The asymptotic setting is particularly interesting in the case of commitment schemes as we can consider computational binding; a full proof will show the adversary has only negligible chance of winning the binding game.

To realise such a proof we parametrise over a family of cyclic groups; specifically, we change the type from '*grp cyclic group* to *nat* ⇒ '*grp cyclic group*. Thus the cyclic group is parametrised by the security parameter, a natural. After importing the concrete setting parametrically for all n, all algorithms depend explicitly on the security parameter. Moreover, due to Isabelle's module structure we are able to use results proven in the concrete setting in the newly constructed asymptotic setting. It is worth noting that results from the concrete setting can only be used once the import has been proven valid, something the user is required to do when importing a module.

The asymptotic properties for correctness and hiding can be seen in Theorem 8. Superficially, the only difference is that the security parameter is an input to every statement and function. At a deeper level, the proof uses the equivalent theorems from the concrete setting and the module machinery to discharge the proof.

**Theorem 8.** *In the asymptotic case, for security parameter,* n*, we have:*


The more interesting case is the proof of computational binding as we are required to show the binding advantage is negligible. In the concrete setting (Theorem 7) we showed we could construct an adversary that had the same advantage against the discrete log problem as the binding game. In the asymptotic setting we are able to assume that the discrete logarithm assumption holds; that an adversary only has a negligible chance of winning the discrete log game. Using this we can prove that the binding advantage too is negligible. This is shown in Theorem 9.

#### **Theorem 9**

$$negligible(\lambda n.\ bind\_advantage_n(\mathcal{A}\ n)) \longleftrightarrow negligible(\lambda n.\ dis\_log\_advantage_n(dis\_log\_\mathcal{A}\ n\ (\mathcal{A}\ n)))$$

Our formalisation provides proofs in the asymptotic case for all relevant properties presented in this paper in a similar manner as described above. We refer the reader to our formalisation for more details.

### **9 Conclusions**

In this work we have demonstrated that commitment schemes and Σ-protocols can be formally proved secure in the computational model using a general abstract framework. Our work uses Isabelle/HOL and its modularity mechanisms, but in principle could be replicated in other interactive theorem provers. The abstract frameworks can be used by others to formalise new commitment schemes and Σ-protocols. The advantage of reasoning back to the same general framework is that one can be sure the correct properties and definitions are being considered; this consistency is not always apparent in informal cryptographic proofs. We suggest that cryptographic advances should be monitored within a formal framework where one is required either to use the exact definitions set out formally (the proof could still be done on pen and paper) or to provide a formal proof that the chosen definitions are equivalent. This would help alleviate the abundance of small differences in definitional approaches which undermine the field.

At the present state-of-the-art, prototyping this approach in an interactive theorem prover seems essential as it allows one to explore the reasoning and definition principles which are most effective in the domain. Eventually we may hope that bespoke foundational reasoning tools could be built which may be more usable by applied cryptographers (as is the aim of EasyCrypt, although it is not foundational).

One major advantage of our framework being implemented in Isabelle is that we can benefit from the vast infrastructure that comes with a well-developed theorem prover, in contrast with custom-made tools. We benefit from the interactive nature of Isabelle, which gives users flexibility, but also from the high level of automation and the many proof engines available.

While CryptHOL, and thus our framework, still requires a high level of interactive theorem proving knowledge to use, new features are being developed that make it more usable by the working cryptographer. For example, recent work in Isabelle on monad normalisation [19] has made handling the commuting of samplings, previously a technical and subtle exercise, much simpler. As more features like this automate the intricate details needed in proofs, the barrier to entry for using CryptHOL will be significantly reduced.

**Future Work.** The frameworks we provide here can be used and instantiated to give formal proofs of new commitment schemes and Σ-protocols. Both of these primitives are used to provide security in the malicious model; consequently we see this work as a building block for further formal proofs in this area.

### **10 Related Work**

Little work has been done on formalising the computational model, compared to the symbolic model. It is challenging because it requires mathematical reasoning about probabilities and failure events in addition to logic. The CertiCrypt [3] tool, built in Coq, helped capture the reasoning principles that were later implemented directly in the dedicated interactive EasyCrypt tool [4]. Again in Coq, the Foundational Cryptographic Framework [18] provides a definitional language for probabilistic programs, a theory that is used to reason about programs, and a library of tactics for game-based proofs.

CryptHOL [6], formalised in Isabelle, has been used in the game-based setting [16], as well as the simulation-based paradigm [7]. Isabelle is a foundational framework that relies on a set of accepted consistent axioms [10,13] and thus provides a high guarantee of correctness in proofs.

The Pedersen commitment scheme has been proved secure in EasyCrypt in [17]. One noticeable difference in the proof effort required lies in the construction of the adversary used to prove computational binding. In Isabelle we had to work hard to express the output of the adversary in the binding game as a division of two elements of the finite field, and we were required to prove extra properties of the Bezout function, whereas the division can be expressed easily in EasyCrypt.

#### **10.1 Comparison with EasyCrypt**

EasyCrypt is considered the state of the art among proof assistants for cryptography; it was designed as a dedicated tool for the working cryptographer. It has a larger user base than CryptHOL, partially because it was developed a number of years earlier and has greater support and documentation. The barrier to entry for EasyCrypt is lower than for Isabelle. We argue, however, that there is room for more than one proof assistant for cryptographic proof; in fact we suggest it is essential to the development of formal proof in this area. Growth in an area of research is rarely achieved by considering only one approach; different proof assistants allow for different proof styles and thus different insights into cryptographic proofs at a fundamental level.

One such difference in approach is the ability to follow paper proofs explicitly. Isabelle's deep and extensive foundations in mathematical logic mean there is a large amount of machinery available to the user when completing proofs. This allows one to follow the proof method of the paper proof more closely. In EasyCrypt this is sometimes not possible: for example, in [11] the authors had to prove on paper that the security definitions they formalised were equivalent to the traditional definitions in the literature. At a technical level this is because the proof technique in EasyCrypt is often to reduce proofs to showing properties about the equivalence of programs. This is not necessarily a weakness of EasyCrypt, as it allows for new insights into proof techniques, but it highlights a difference between the two systems.

Finally, when using CryptHOL in Isabelle all proofs are checked with respect to the same logical core. That is, the whole CryptHOL framework resides within Isabelle. However, some fundamental properties in EasyCrypt are outsourced to be proven in Coq. Thus one could consider the approach of CryptHOL and Isabelle to be more foundational.

**Acknowledgements.** We are grateful to Andreas Lochbihler for providing and continuing to develop CryptHOL and for his kind help given with using it.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Orchestrating Layered Attestations**

John D. Ramsdell1(B), Paul D. Rowe<sup>1</sup>, Perry Alexander<sup>2</sup>, Sarah C. Helble<sup>3</sup>, Peter Loscocco<sup>4</sup>, J. Aaron Pendergrass<sup>3</sup>, and Adam Petz<sup>2</sup>

<sup>1</sup> The MITRE Corporation, Bedford, USA ramsdell@mitre.org

<sup>2</sup> The University of Kansas, Lawrence, USA

<sup>3</sup> Johns Hopkins University Applied Physics Laboratory, Laurel, USA <sup>4</sup> National Security Agency, Fort Meade, USA

**Abstract.** We present Copland, a language for specifying layered attestations. Layered attestations provide a remote appraiser with structured evidence of the integrity of a target system to support a trust decision. The language is designed to bridge the gap between formal analysis of attestation security guarantees and concrete implementations. We therefore provide two semantic interpretations of terms in our language. The first is a denotational semantics in terms of partially ordered sets of events. This directly connects Copland to prior work on layered attestation. The second is an operational semantics detailing how the data and control flow are executed. This gives explicit implementation guidance for attestation frameworks. We show a formal connection between the two semantics ensuring that any execution according to the operational semantics is consistent with the denotational event semantics. This ensures that formal guarantees resulting from analyzing the event semantics will hold for executions respecting the operational semantics. All results have been formally verified with the Coq proof assistant.

### **1 Introduction**

It is common to ask a particular target system whether it is trustworthy enough to engage in a given activity. Remote attestation is a useful technique to support such trust decisions in a wide variety of contexts. Fundamentally, remote attestation consists in generating *evidence* of a system's integrity via *measurements*, and *reporting* the evidence to a remote party for appraisal. Depending on their interpretation of the evidence, the remote appraiser can adjust their decision according to the level of risk they are willing to assume.

Others have recognized the insufficiency of coarse-grained measurements in supporting trust decisions [8,10,20,22]. Integrity evidence is typically either too broad or too narrow to provide useful information to an appraiser. Very broad evidence—such as patch levels for software—easily allows compromises to go undetected by attestation. Very narrow evidence—such as a combined hash of the complete trusted computing base—does not allow for natural variation across systems and over time.

An alternative approach is to build a global picture of system integrity by measuring a subset of system components and reasoning about their integrity individually and as a coherent whole. This approach can give an appraiser a more nuanced view of the target system's state because it can isolate integrity violations, telling the appraiser exactly which portions of the system can or cannot be trusted. We call this approach *layered attestation* because protected isolation frequently built into systems (e.g. hypervisor-enforced separation of virtual machines) allows the attestation to build the global picture of integrity from the bottom up, one layer at a time. A layered attestation whose structure mimics the layered dependency structure of a target system can provide strong trust guarantees. In prior work, we have formally proved that "bottom-up" strategies for layered attestation force an adversary to either corrupt well-protected components or work within small time-of-check-time-of-use windows [17,18].

The "bottom-up" principle has been embodied in many attestation systems (e.g. [2,6,7,10,22]). A common tactic in these papers is to design the target system and the attestation protocol in tandem to ensure the structure of the attestation corresponds to the structure of the system. This results in solutions that are too rigid and overly prescriptive. The solutions do not translate to other systems with different structures.

In previous work, members of our team have taken a different approach. Maat is a policy-based measurement and attestation (M&A) framework which provides a centralized, pluggable service to gather and report integrity measurements [16]. Maat listens for attestation requests and can act as both an appraiser and an attester, depending on the needs of the current scenario. After a request for appraisal is received, the Maat instance on the appraiser system contacts and negotiates with the attesting system's Maat instance to agree upon the set of evidence that must be provided for the scenario. Thus Maat provides a flexible set of capabilities that can be tailored to the needs of any given situation. It is therefore a much more extensible attestation framework.

In early development of Maat, the negotiation was entirely based on a set of well-known UUIDs and was limited in flexibility, especially when Maat instances did not share a core set of measurement capabilities. We discovered that this approach to negotiation severely limited the extensibility of Maat. It is not sufficient to have a flexible set of attestation mechanisms—a flexible language for specifying layered attestations is crucial. This paper introduces such a language.

**Contribution.** We present Copland, a language and formal system for orchestrating layered attestations. Copland provides domain-specific syntax for specifying attestation protocols, an operational semantics for guiding implementations, and a denotational semantics for reasoning and negotiation. We designed Copland with Maat in mind, aiming to address three main requirements.

First, it must be flexible enough to accommodate the wide diversity of capabilities offered by Maat. Copland is parametric with respect to the basic actions that generate and process evidence (i.e. measurement and bundling). Since we cannot expect all platforms and architectures to have the same set of capabilities, Copland focuses instead on specifying the ways in which these pieces fit together. Copland programs, which we call *phrases* or *terms*, are built out of a small set of operators designed to orchestrate the activities of measurement agents across several layers of a target system.

Second, the language must have an unambiguous execution semantics. We provide a formal, operational semantics allowing a target to know precisely how to manage the flow of control and data throughout the attestation. This operational semantics serves as a correctness constraint for implementations, and generates traces of events that record the order in which actions occurred.

Finally, it must enable static analysis to determine the trust properties guaranteed by alternative phrases. For this purpose we provide a denotational semantics relating phrases to a partially ordered set of events. This semantics is explicitly designed to connect with our prior work on analytic principles of layered attestation [17,18]. By applying those principles in static analysis, both target and appraiser can write policies determining which phrases may be used in which situations based on the trust guarantees they provide.

Critically, we prove a strong connection between the operational execution semantics and the denotational event semantics. We show that any trace generated by the operational semantics is a linearization of the event partial ordering given by the denotational semantics. This ensures that any trust conclusions made from the event partial order are guaranteed to hold over the concrete execution. In particular, our previous work [17,18] characterizes what an adversary must do to avoid detection given a specific partial order of events, identifying strategies that force an adversary either to work quickly in short time-of-check-time-of-use windows, or to dig deeper into more protected layers of the system. This connection is particularly important in light of the flexibility of the language. Since our basic tenet is that a more constrained language is inherently of less value, it is crucial that we provide a link to analytic techniques that help people distinguish between good and bad ways to perform a layered attestation. We discuss this connection to our previous work in much more detail in Sect. 7.

**Fig. 1.** Semantic relations

Figure 1 depicts the connections among our various contributions. It also provides a useful outline of the paper. Section 3 describes the syntax of Copland corresponding to the apex of the triangle in Fig. 1. Section 4 introduces events. Events are the foundation for both semantic notions depicted in Fig. 1. Each semantic notion constrains the event ordering in its own way. The denotational semantics of the left leg of the triangle is presented in Sect. 5, and the operational semantics of the right leg is given in Sect. 6. The crucial theorem connecting the two semantic notions is sketched in Sect. 7.

All lemmas and theorems stated in this paper have been formally verified using the Coq proof assistant [1]. The Coq proofs are available at https://kusldg.github.io/copland/. The notation used in this paper closely follows the Coq proofs. The tables in Appendix B link figures and formulas with their definitions in the Coq proofs.

Before jumping into the formal details of the syntax and semantics of Copland, however, we present a sequence of simple examples designed to give the reader a feel for the language and its purpose.

### **2 Examples of Layered Attestations**

Consider an example of a corporate gateway that appraises machines before allowing them to join the corporate network. A simple attestation might entail a request for the machine to perform an asset inventory to ensure all software is up-to-date. For purposes of exposition, we may view this as an abstract userspace measurement USM that takes an argument list $\bar{a}\_1$ of the enterprise software to inventory. We can express a request for a particular target p to perform this measurement with the following Copland phrase:

$$
@\_p\ \mathsf{USM}\ \bar{a}\_1 \tag{1}
$$

This says the measurement capability identifiable as USM should be executed at the location identified by p using arguments $\bar{a}\_1$. The request results in evidence of the form $\mathsf{U}\_p(\xi)$ indicating the type of measurement performed, the target of the measurement p, and any previously generated evidence (in this case the empty evidence $\xi$) it received and combined with the newly generated evidence.

If the company is concerned with the assets in the inventory being undermined by a rootkit in the operating system kernel, it might require additional evidence that no such rootkit exists. This could be done by asking for a kernel integrity measurement KIM to be taken of the place p in addition to the userspace measurement. The request could be made with the following phrase:

$$@\_p\, (\mathsf{KIM}\ p\ \bar{a}\_2 \stackrel{(\bot,\bot)}{\sim} \mathsf{USM}\ \bar{a}\_1) \tag{2}$$

In this notation, $\mathsf{KIM}\ p\ \bar{a}\_2$ represents a request for the KIM measurement capability to be applied to the target place p with arguments $\bar{a}\_2$. The symbol $\stackrel{(l,r)}{\sim}$ indicates the two measurements may be taken concurrently. The annotation $l$ defines how evidence accumulated so far is transformed for use by the phrase on the left, and $r$ for the one on the right. In the case of $(\bot, \bot)$, no evidence is sent in either direction. The evidence resulting from the two composed measurements has the form $\mathsf{K}^p\_p(\xi) \parallel \mathsf{U}\_p(\xi)$, where $\parallel$ indicates the measurements were invoked concurrently.

If the enterprise has configured their machines to have two layers of different privilege levels (say by virtualization), then they may wish to request that the kernel measurement be taken from a more protected location q. This results in the following request.

$$@\_q\, (\mathsf{KIM}\ p\ \bar{a}\_2 \stackrel{(\perp,\perp)}{\sim} @\_p\ \mathsf{USM}\ \bar{a}\_1) \tag{3}$$

Notice the kernel measurement target is still the kernel at p, but the request is now being made of the measurement capability located at q. The kernel measurement of p taken from q and the request for p to take a userspace measurement of its own environment can occur concurrently. The resulting evidence has the form $\mathsf{K}^p\_q(\xi) \parallel \mathsf{U}\_p(\xi)$, where the subscript q indicates the kernel measurement was taken from the vantage point of q, and the superscript p indicates the location of the kernel measurer's target. The subscript p in the second occurrence of the @ sign indicates that the userspace measurement is taken from location p.

Finally, consider two more changes to the request that make the evidence more convincing. By measuring the kernel at p *before* the userspace measurement occurs, the appraiser can learn that the kernel was uncompromised at the time of the userspace measurement. This bottom-up strategy is common in approaches to layered attestation [17,22]. Additionally, an appraiser may wish each piece of evidence to be signed as a rudimentary chain of evidence. These can both be specified with the following phrase.

$$@\_q \left( (\mathsf{KIM}\ p\ \bar{a}\_2 \rightarrow \mathsf{SIG}) \stackrel{(\perp,\perp)}{\prec} @\_p\, (\mathsf{USM}\ \bar{a}\_1 \rightarrow \mathsf{SIG}) \right) \tag{4}$$

In this phrase, the ≺ symbol is used to request that the term on the left complete its execution before starting execution of the term on the right. The → symbol routes data from the term on the left to the term on the right, similar to function composition. In this case evidence coming from KIM and USM is routed to two separate instances of a digital signature primitive. Since these signatures occur at two different locations, they will use two different signing keys. The resulting evidence has the form $[\![\mathsf{K}^p\_q(\xi)]\!]\_q \mathrel{;;} [\![\mathsf{U}\_p(\xi)]\!]\_p$, where ;; indicates the evidence was generated in sequence, and the double square brackets represent signatures using the private key associated with the location identified by the subscript.
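To make the operator structure of these examples concrete, the phrase in Eq. (4) can be modeled as an abstract syntax tree. The following is a hypothetical Python encoding for illustration only; the constructor names (`At`, `Arrow`, `Lseq`) and the omission of the splitting annotation as a field are our own choices, not the paper's Coq definitions.

```python
from dataclasses import dataclass

# Hypothetical encoding of the Copland operators used in examples (1)-(4).
# Constructor names are ours; the paper's definitions live in Coq.

@dataclass(frozen=True)
class USM:                 # userspace measurement with an argument list
    args: tuple

@dataclass(frozen=True)
class KIM:                 # kernel integrity measurement of a target place
    target: str
    args: tuple

@dataclass(frozen=True)
class SIG:                 # sign the accumulated evidence
    pass

@dataclass(frozen=True)
class At:                  # @_p t: ask place p to run phrase t
    place: str
    phrase: object

@dataclass(frozen=True)
class Arrow:               # (t1 -> t2): pipe t1's evidence into t2
    left: object
    right: object

@dataclass(frozen=True)
class Lseq:                # (t1 <(bot,bot) t2): t1 finishes before t2 starts,
    left: object           # with empty evidence sent to both subterms
    right: object

# Eq. (4): @_q ((KIM p a2 -> SIG)  <(bot,bot)  @_p (USM a1 -> SIG))
phrase4 = At("q", Lseq(Arrow(KIM("p", ("a2",)), SIG()),
                       At("p", Arrow(USM(("a1",)), SIG()))))
```

Because the phrase is plain data, a negotiation step can compare or rewrite it (for example, substituting a cached measurement constructor for `USM`) before execution.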

Copland provides a level of flexibility and explicitness that can be leveraged for more than the prescription of the evidence to be gathered. Using this common semantics, appraisers and attesters have the ability to negotiate *specific* measurement agents and targets to utilize to prove integrity. For example, if the measurement requested is computationally intensive, an attester may prefer to provide a cached version of the evidence. The appraiser may be willing to accept this cached version, depending on local policy. In this scenario, a negotiation would take place between the two systems to determine an agreeable set of terms. The appraiser could begin by requesting that Eq. (4) be performed by the target, which would then counter with a different phrase specifying cached instead of fresh measurement. Depending on the implementation, this difference could utilize an entirely separate measurement primitive (e.g., C USM instead of USM) or merely a separate set of arguments to the primitive. The ability to specify the collection of previously generated evidence is especially important when gathering evidence created via a measured boot.

The actions taken to appraise evidence can also be defined by phrases and negotiated before the attestation takes place. If the target is willing to perform a measurement action but doesn't trust the appraiser with the result, the two parties could agree upon a mutually trusted third party to act as the appraiser.

### **3 Phrases**

We begin with the basic syntax of phrases in Copland. Figure 2 defines the grammar of phrases (T) parameterized by atomic actions (A) and the type (E) of evidence they produce when evaluated. Figure 3 defines phrase evaluation. Each phrase specifies what measurements are taken, various operations on evidence, and where measurements and operations are performed. Phrases also specify orderings and dependencies among measurements and operations.

$$\begin{aligned}
A &\;::=\; \mathsf{CPY} \mid \mathsf{USM}\ \bar{a} \mid \mathsf{KIM}\ P\ \bar{a} \mid \mathsf{SIG} \mid \mathsf{HSH} \mid \cdots \\
T &\;::=\; A \mid @\_P\ T \mid (T \to T) \mid (T \stackrel{\pi}{\prec} T) \mid (T \stackrel{\pi}{\sim} T) \\
E &\;::=\; \xi \mid \mathsf{U}\_P(E) \mid \mathsf{K}^P\_P(E) \mid [\![E]\!]\_P \mid \#\_P\,E \mid (E \mathrel{;;} E) \mid (E \parallel E) \mid \cdots
\end{aligned}$$

where $\pi = (\pi\_1, \pi\_2)$ is a pair of splitting functions.

**Fig. 2.** Phrase and evidence grammar

The atomic phrases either produce evidence via measurement, or transform evidence via computation. Some actions, like $\mathsf{USM}\ \bar{a}$, perform measurements of their associated place, while others, such as $\mathsf{KIM}\ q\ \bar{a}$, measure another place. A *userspace measurement*, $\mathsf{USM}\ \bar{a}$, measures the local environment. The term $@\_p\ \mathsf{USM}\ \bar{a}$ requests that place p perform some measurement $\mathsf{USM}\ \bar{a}$ of its userspace. Such measurements may range from a simple file hash to complex run-time analysis of an application. A *kernel integrity measurement*, $\mathsf{KIM}\ q\ \bar{a}$, measures another place. The term $@\_p\ \mathsf{KIM}\ q\ \bar{a}$ requests that p perform a kernel measurement on place q. Such measurements measure one place from another and perform integrity measurements such as LKIM [14]. Starting from a trusted place p, $@\_p\ \mathsf{KIM}\ q\ \bar{a}$ can gather evidence for establishing trust in q and transitively construct chains of trusted enclaves.

The Copland phrase $@\_p\ t$ corresponds to the essential function of remote attestation: making a request of place p to execute a protocol term t. Places correspond with attestation managers that are capable of responding to attestation requests. Places may be as simple as an IoT device that returns a single value on request or as complicated as a full SELinux installation capable of complex protocol execution.

Evidence produced by $@\_p\ \mathsf{USM}\ \bar{a}$ and $@\_p\ \mathsf{KIM}\ q\ \bar{a}$ has types $\mathsf{U}\_p(e)$ and $\mathsf{K}^q\_p(e)$ respectively, where p is the place performing the measurement, q is the target place, and e is the type of incoming evidence. Place p is obtained from the context specified by the $@\_p\ t$ phrase invoking $\mathsf{KIM}\ q\ \bar{a}$. Notice that we work with dependent types.

The phrases $(t\_1 \to t\_2)$, $(t\_1 \stackrel{\pi}{\prec} t\_2)$, and $(t\_1 \stackrel{\pi}{\sim} t\_2)$ specify sequential and parallel composition of subterms. Phrase $(t\_1 \to t\_2)$ evaluates two terms in sequence, passing the evidence output by the first term as input to the second term. The phrase $(t\_1 \stackrel{\pi}{\prec} t\_2)$ is similar in that the first term runs to completion before the second term begins. It differs in that evidence is not sent from the first term as input to the second term. Instead, each term receives some filtered version of the evidence accumulated thus far from the parent phrase. This evidence is split between the two subterms according to the splitting functions $\pi = (\pi\_1, \pi\_2)$ that specify the filter used before passing evidence to each subterm. The resulting evidence has the form $(e\_1 \mathrel{;;} e\_2)$ indicating evidence gathered in sequence. Finally, $(t\_1 \stackrel{\pi}{\sim} t\_2)$ specifies that its two subterms execute in parallel with data splitting specified by $\pi = (\pi\_1, \pi\_2)$. The evidence term $(e\_1 \parallel e\_2)$ captures that subterm evaluation occurs in parallel.

Two common filters are identity and empty. The identity filter $\mathit{id}\ e = e$ returns its argument, producing a copy of the filtered evidence, while $\bot\ e = \xi$ always returns empty evidence regardless of input. For example, $\pi = (\bot, \bot)$ passes empty evidence to both subterms, $\pi = (\bot, \mathit{id})$ sends all evidence to the right subterm, and $\pi = (\mathit{id}, \mathit{id})$ sends all evidence to both subterms.

A collection of operator terms specifies various operations over evidence. SIG, HSH, and CPY generate a signature, a hash, and a copy of evidence previously gathered. The evidence forms generated by SIG and HSH are $[\![e]\!]\_p$ and $\#\_p\,e$, respectively. A place identifies itself in a hash by including its identity in the data being hashed. Unlike a cryptographic signature, this serves only to identify the entity performing the hash. It does not provide protection against forgery. Our choice to use hashes in this way is not critical to achieving the Copland design goals. Replacing it with more standard hashes would cause no problem. Other operator terms are anticipated, but these are sufficient for this exposition and for most phrases used in our examples.

$$\begin{aligned}
\mathcal{E}(\mathsf{CPY}, p, e) &= e \\
\mathcal{E}(\mathsf{USM}\ \bar{a}, p, e) &= \mathsf{U}\_{p}(e) \\
\mathcal{E}(\mathsf{KIM}\ q\ \bar{a}, p, e) &= \mathsf{K}\_{p}^{q}(e) \\
\mathcal{E}(\mathsf{SIG}, p, e) &= [\![e]\!]\_{p} \\
\mathcal{E}(\mathsf{HSH}, p, e) &= \#\_{p}\,e \\
\mathcal{E}(@\_{q}\,t, p, e) &= \mathcal{E}(t, q, e) \\
\mathcal{E}(t\_{1} \to t\_{2}, p, e) &= \mathcal{E}(t\_{2}, p, \mathcal{E}(t\_{1}, p, e)) \\
\mathcal{E}(t\_{1} \stackrel{\pi}{\prec} t\_{2}, p, e) &= \mathcal{E}(t\_{1}, p, \pi\_{1}(e)) \mathrel{;;} \mathcal{E}(t\_{2}, p, \pi\_{2}(e)) \text{ where } \pi = (\pi\_{1}, \pi\_{2}) \\
\mathcal{E}(t\_{1} \stackrel{\pi}{\sim} t\_{2}, p, e) &= \mathcal{E}(t\_{1}, p, \pi\_{1}(e)) \parallel \mathcal{E}(t\_{2}, p, \pi\_{2}(e)) \text{ where } \pi = (\pi\_{1}, \pi\_{2})
\end{aligned}$$

**Fig. 3.** Evidence semantics
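The evaluation function of Fig. 3 can be read as a small recursive interpreter over phrases. The following is a minimal sketch in Python, assuming an illustrative tagged-tuple encoding of phrases and evidence that is our own invention (the paper's development is in Coq); it reproduces the evidence shape claimed for phrase (3).

```python
# A sketch of the evidence semantics of Fig. 3 as a recursive evaluator.
# Phrases and evidence are tagged tuples; names are illustrative only.

def bot(e):            # the "empty" splitting function: discards evidence
    return ("xi",)     # xi denotes empty evidence

def ident(e):          # the identity splitting function
    return e

def E(t, p, e):
    """Evaluate phrase t at place p with input evidence e (cf. Fig. 3)."""
    tag = t[0]
    if tag == "CPY":
        return e
    if tag == "USM":                     # ("USM", args); U_p(e) ignores args
        return ("U", p, e)
    if tag == "KIM":                     # ("KIM", q, args) -> K^q_p(e)
        return ("K", t[1], p, e)
    if tag == "SIG":
        return ("sig", p, e)             # [[e]]_p
    if tag == "HSH":
        return ("hash", p, e)            # #_p e
    if tag == "AT":                      # ("AT", q, t1): evaluate t1 at q
        return E(t[2], t[1], e)
    if tag == "ARROW":                   # ("ARROW", t1, t2): sequence evidence
        return E(t[2], p, E(t[1], p, e))
    if tag in ("LSEQ", "BPAR"):          # ("LSEQ"/"BPAR", (pi1, pi2), t1, t2)
        pi1, pi2 = t[1]
        e1, e2 = E(t[2], p, pi1(e)), E(t[3], p, pi2(e))
        return (";;" if tag == "LSEQ" else "||", e1, e2)
    raise ValueError(tag)

# Phrase (3): @_q (KIM p a2 ~(bot,bot) @_p USM a1)
phrase3 = ("AT", "q", ("BPAR", (bot, bot),
                       ("KIM", "p", ("a2",)),
                       ("AT", "p", ("USM", ("a1",)))))
print(E(phrase3, "r", ("xi",)))
# -> ('||', ('K', 'p', 'q', ('xi',)), ('U', 'p', ('xi',)))
```

The result matches the evidence type $\mathsf{K}^p\_q(\xi) \parallel \mathsf{U}\_p(\xi)$ discussed in Sect. 2.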

### **4 Events**

Events are observable effects associated with phrase execution. For example, a userspace measurement event occurs when a USM term executes; a remote request event occurs when $@\_p\ t$ begins executing; and a sequence of split and join events occurs when the various sequential and parallel composition terms execute. The events resulting from executing a phrase characterize that phrase.

The events associated with a subphrase $t\_1$ within phrase $t\_0$ are determined by the position in $t\_0$ at which the subphrase occurs. For example, the term $(t \to t)$ has two occurrences of t that will be associated with some events. It is essential that the set of events associated with the left occurrence is disjoint from the set of events associated with the right occurrence. For this reason, each event has an associated natural number that is unique to that event.


#### **Fig. 4.** Annotated terms

Annotated terms enable the generation of a unique number for each event in the Coq proofs. An annotated term, $[t]\_i^j$, adds bounds i and j to term t, where i and j are natural numbers. By construction each event related to $[t]\_i^j$ has a unique natural number k such that $i \le k < j$. The set of all annotated terms is defined by $\bar{T} = \bigcup\_{i,j=0}^{\infty} T\_i^j$, where $T\_i^j$ is defined in Fig. 4. The number of events associated with $[t]\_i^j$ is $j - i$.

As examples, two terms from T¯ are:

$$[[\mathsf{KIM}\ p\ \bar{a}]\_0^1 \to [\mathsf{SIG}]\_1^2]\_0^2 \qquad\qquad [@\_p\,[\mathsf{USM}\ \bar{a}]\_1^2]\_0^3$$

The annotations on KIM and SIG indicate that the event associated with KIM is numbered 0 while the event associated with SIG is numbered 1. The entire sequence term includes numbers for both KIM and SIG. Similarly, the $@\_p\ \mathsf{USM}\ \bar{a}$ term allocates the number 1 for USM, and adds 0 and 2 for a request and a reply event, respectively, associated with $@\_p\ t$. For details of annotation generation, see Fig. 9 in Appendix A, which presents a simple function that translates terms into annotated terms.
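The annotation pass can be sketched as a numbering walk over terms, assuming the event budget described above: one number per atomic action, two extra for @ (request and reply), and two extra for the split/join pair of the ≺ and ∼ operators. This is our own reading of Fig. 9 in a hypothetical tuple encoding, not the paper's Coq function.

```python
# A sketch of term annotation: anno(t, i) threads a counter through t and
# returns the annotated term together with the next unused number j, so
# that [t]_i^j owns exactly the numbers i..j-1. Encoding is illustrative.

def anno(t, i):
    """Return ([t]_i^j as ("ANN", i, j, body), j)."""
    tag = t[0]
    if tag in ("CPY", "USM", "KIM", "SIG", "HSH"):
        return (("ANN", i, i + 1, t), i + 1)           # one atomic event
    if tag == "AT":                                    # ("AT", q, t1):
        body, j = anno(t[2], i + 1)                    # request=i, reply=j
        return (("ANN", i, j + 1, ("AT", t[1], body)), j + 1)
    if tag == "ARROW":                                 # only subterm events
        t1, j = anno(t[1], i)
        t2, k = anno(t[2], j)
        return (("ANN", i, k, ("ARROW", t1, t2)), k)
    if tag in ("LSEQ", "BPAR"):                        # split=i, join=k
        t1, j = anno(t[2], i + 1)
        t2, k = anno(t[3], j)
        return (("ANN", i, k + 1, (tag, t[1], t1, t2)), k + 1)
    raise ValueError(tag)

# [@_p [USM a]_1^2]_0^3: request=0, USM=1, reply=2
annotated, j = anno(("AT", "p", ("USM", ("a",))), 0)
assert (annotated[1], annotated[2]) == (0, 3) and j == 3
```

The difference of the two bounds, here 3 − 0, counts the events the term produces, matching the j − i property stated above.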

Figure 5 presents event syntax while Fig. 6 relates phrases to events. The relation between annotated term t, place p, evidence e, and the associated event v is written $t \diamond\_e^p v$. Given some term t and current evidence e in place p, $t \diamond\_e^p v$ relates event v to t in p. Note that each event has a natural number whose purpose is to uniquely identify the event as required by the Coq proofs.

```
V ::= CPY(N, P, E) | USM(N, P, L, E, E) | KIM(N, P, L, E, E)
    | SIG(N, P, E, E) | HSH(N, P, E, E) | REQ(N, P, P, E)
    | RPY(N, P, P, E) | SPLIT(N, P, E, E, E) | JOIN(N, P, E, E, E)
```
**Fig. 5.** Event grammar

$$\begin{aligned}
[\mathsf{CPY}]\_i^{i+1} &\diamond\_e^p \mathsf{CPY}(i, p, e) \\
[\mathsf{USM}\ \bar{a}]\_i^{i+1} &\diamond\_e^p \mathsf{USM}(i, p, \bar{a}, e, \mathsf{U}\_p(e)) \\
[\mathsf{KIM}\ q\ \bar{a}]\_i^{i+1} &\diamond\_e^p \mathsf{KIM}(i, p, \bar{a}, e, \mathsf{K}^q\_p(e)) \\
[\mathsf{SIG}]\_i^{i+1} &\diamond\_e^p \mathsf{SIG}(i, p, e, [\![e]\!]\_p) \\
[\mathsf{HSH}]\_i^{i+1} &\diamond\_e^p \mathsf{HSH}(i, p, e, \#\_p\,e) \\
[@\_q\,t]\_i^j &\diamond\_e^p \mathsf{REQ}(i, p, q, e) \\
[@\_q\,t]\_i^j &\diamond\_e^p v \quad \text{if } t \diamond\_e^q v \\
[@\_q\,t]\_i^j &\diamond\_e^p \mathsf{RPY}(j-1, p, q, \bar{\mathcal{E}}(t, q, e)) \\
[t\_1 \to t\_2]\_i^j &\diamond\_e^p v \quad \text{if } t\_1 \diamond\_e^p v \\
[t\_1 \to t\_2]\_i^j &\diamond\_e^p v \quad \text{if } t\_2 \diamond\_{\bar{\mathcal{E}}(t\_1,p,e)}^p v \\
[t\_1 \stackrel{\pi}{\prec} t\_2]\_i^j &\diamond\_e^p \mathsf{SPLIT}(i, p, e, \pi\_1(e), \pi\_2(e)) \\
[t\_1 \stackrel{\pi}{\prec} t\_2]\_i^j &\diamond\_e^p v \quad \text{if } t\_1 \diamond\_{\pi\_1(e)}^p v \\
[t\_1 \stackrel{\pi}{\prec} t\_2]\_i^j &\diamond\_e^p v \quad \text{if } t\_2 \diamond\_{\pi\_2(e)}^p v \\
[t\_1 \stackrel{\pi}{\prec} t\_2]\_i^j &\diamond\_e^p \mathsf{JOIN}(j-1, p, e\_1, e\_2, e\_1 \mathrel{;;} e\_2) \\
[t\_1 \stackrel{\pi}{\sim} t\_2]\_i^j &\diamond\_e^p \mathsf{SPLIT}(i, p, e, \pi\_1(e), \pi\_2(e)) \\
[t\_1 \stackrel{\pi}{\sim} t\_2]\_i^j &\diamond\_e^p v \quad \text{if } t\_1 \diamond\_{\pi\_1(e)}^p v \\
[t\_1 \stackrel{\pi}{\sim} t\_2]\_i^j &\diamond\_e^p v \quad \text{if } t\_2 \diamond\_{\pi\_2(e)}^p v \\
[t\_1 \stackrel{\pi}{\sim} t\_2]\_i^j &\diamond\_e^p \mathsf{JOIN}(j-1, p, e\_1, e\_2, e\_1 \parallel e\_2)
\end{aligned}$$

where in the JOIN rules $e\_1 = \bar{\mathcal{E}}(t\_1, p, \pi\_1(e))$ and $e\_2 = \bar{\mathcal{E}}(t\_2, p, \pi\_2(e))$.

**Fig. 6.** Events of terms

Each atomic term has exactly one associated event that records execution details of the term, including resulting evidence. Each $@\_p\ t$ term is associated with a request event, a reply event, and the events associated with term t. Each $(t\_1 \to t\_2)$ term is associated with the events of its subterms. Both $(t\_1 \stackrel{\pi}{\prec} t\_2)$ and $(t\_1 \stackrel{\pi}{\sim} t\_2)$ are associated with the events of their subterms as well as a split and a join event. The evidence function $\bar{\mathcal{E}}$ is the same as $\mathcal{E}$ except that it applies to annotated terms instead of terms.

Essential properties of the annotations are expressed in Lemmas 1–3. In each lemma, let ι be a projection from an event to its number.

**Lemma 1.** $[t]\_i^j \diamond\_e^p v$ *implies* $i \le \iota(v) < j$*.*

Each event associated with a term has a number in the range of the term's annotation. This is critical to the way that subterm annotations are composed to form term annotations.

**Lemma 2.** $t \diamond\_e^p v\_1$ *and* $t \diamond\_e^p v\_2$ *and* $\iota(v\_1) = \iota(v\_2)$ *implies* $v\_1 = v\_2$*.*

Event numbers are unique to events. If two events have the same number, they must be the same event.

**Lemma 3.** $i \le k < j$ *implies for some* v*,* $[t]\_i^j \diamond\_e^p v$ *and* $\iota(v) = k$*.*

There is an event associated with every number in an annotation range. There are no unassigned numbers in the range of an annotation.
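Together, Lemmas 1–3 say the event numbers of $[t]\_i^j$ are exactly $\{i, \ldots, j-1\}$, each used once. A small sketch, using a hypothetical tuple encoding of annotated terms (ours, not the paper's Coq representation), checks this on a hand-annotated sequence term:

```python
# Event numbers of an annotated term ("ANN", i, j, body): one number per
# atomic event, request i and reply j-1 for @, split i and join j-1 for
# the sequential/parallel composition operators.

def event_nums(t):
    _, i, j, body = t
    tag = body[0]
    if tag in ("CPY", "USM", "KIM", "SIG", "HSH"):
        return [i]                                     # one atomic event
    if tag == "AT":                                    # ("AT", q, t1)
        return [i] + event_nums(body[2]) + [j - 1]
    if tag == "ARROW":                                 # ("ARROW", t1, t2)
        return event_nums(body[1]) + event_nums(body[2])
    if tag in ("LSEQ", "BPAR"):                        # split, bodies, join
        return [i] + event_nums(body[2]) + event_nums(body[3]) + [j - 1]
    raise ValueError(tag)

# [[KIM p a]_0^1 -> [SIG]_1^2]_0^2: KIM event 0, SIG event 1
term = ("ANN", 0, 2, ("ARROW",
                      ("ANN", 0, 1, ("KIM", "p", ("a",))),
                      ("ANN", 1, 2, ("SIG",))))
nums = event_nums(term)
assert sorted(nums) == [0, 1]            # covers the range [0, 2)
assert len(set(nums)) == len(nums)       # and each number is unique
```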

### **5 Partial Order Semantics**

The previous mapping of phrases to evidence types defines a denotational semantics for evaluation. The $t \diamond\_e^p v$ relation defines visible events that result when a phrase executes. Here we add a partial order to define correct orderings of events associated with an execution. In Definition 5, we define the strict partial order $\mathcal{R}(t, p, e)$ over the set $\{v \mid t \diamond\_e^p v\}$, for some term t, place p, and initial evidence e. It defines requirements on any event trace produced by evaluating t at p with e.

The relation R(t, p, e) is defined by first introducing a language for representing strict partial orders, then representing semantics of language terms as event partial orders. The grammar defining the objects used to represent strict partial orders is

$$O \;::=\; V \mid (O \rhd O) \mid (O \bowtie O).$$

Events are ordered with the precedes relation. We write $o : v \prec v'$ when event v *precedes* another event $v'$ in partial order o. We write $v \in o$ if event v occurs in o.

**Definition 4 (Precedes).** $o : v \prec v'$ *is the smallest relation such that:*

1. $o = o\_1 \rhd o\_2$ implies ($v \in o\_1$ and $v' \in o\_2$) or $o\_1 : v \prec v'$ or $o\_2 : v \prec v'$
2. $o = o\_1 \bowtie o\_2$ implies $o\_1 : v \prec v'$ or $o\_2 : v \prec v'$
The set of events associated with o is the set {v | v ∈ o}, and o represents the poset that orders that set.

If $o\_1$ and $o\_2$ represent disjoint posets, then $o\_1 \rhd o\_2$ represents the poset that respects the orders in $o\_1$ and $o\_2$ and for which every event in $o\_1$ is before every event in $o\_2$. Therefore, $\rhd$ is called the *before* operator. Additionally, $o\_1 \bowtie o\_2$ represents the poset which simply retains the orders in both $o\_1$ and $o\_2$, and so $\bowtie$ is called the *merge* operator. When applied to mutually disjoint posets, $\rhd$ and $\bowtie$ are associative.
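The before/merge algebra and the precedes relation of Definition 4 can be sketched directly, with events as opaque values in a hypothetical tagged-tuple encoding (ours, not the paper's Coq representation):

```python
# O ::= V | (O |> O) | (O >< O), encoded as ("EV", v), ("BEFORE", o1, o2),
# and ("MERGE", o1, o2); precedes follows Definition 4 clause by clause.

def events(o):
    """The set of events {v | v in o} represented by o."""
    if o[0] == "EV":
        return {o[1]}
    return events(o[1]) | events(o[2])

def precedes(o, v, w):
    """o : v < w, the smallest relation of Definition 4."""
    if o[0] == "BEFORE":   # everything on the left precedes the right
        return ((v in events(o[1]) and w in events(o[2]))
                or precedes(o[1], v, w) or precedes(o[2], v, w))
    if o[0] == "MERGE":    # only the component orders are retained
        return precedes(o[1], v, w) or precedes(o[2], v, w)
    return False           # a single event orders nothing

# The poset of Example 6: REQ(0,...) |> USM(1,...) |> RPY(2,...)
o = ("BEFORE", ("EV", "REQ0"), ("BEFORE", ("EV", "USM1"), ("EV", "RPY2")))
assert precedes(o, "REQ0", "RPY2")       # ordered across the outer before
assert not precedes(o, "RPY2", "REQ0")   # and never in the reverse direction
```

Merging instead of sequencing drops the cross ordering: with `("MERGE", ...)` in place of the outer `"BEFORE"`, `precedes(o, "REQ0", "USM1")` would be false, which is exactly the behavior the ∼ operator relies on.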

#### **Definition 5 (Strict Partial Order)**

$$\mathcal{R}(t, p, e)(v, v') = \mathcal{V}(t, p, e) : v \prec v'$$

*where* V(t, p, e) *is defined in Fig. 7.*

The definition of $\mathcal{V}(t, p, e)$ is carefully crafted so that the posets combined by $\rhd$ and $\bowtie$ are disjoint.

For the phrase $@\_q\ \mathsf{USM}\ \bar{a}$, the strict partial order term with numbering starting at 0 is

**Example 6.** $\mathcal{V}([@\_q\,[\mathsf{USM}\ \bar{a}]\_1^2]\_0^3, p, e) = \mathsf{REQ}(0,\ldots) \rhd \mathsf{USM}(1,\ldots) \rhd \mathsf{RPY}(2,\ldots)$.

$$\begin{aligned}
\mathcal{V}([\mathsf{CPY}]\_i^{i+1}, p, e) &= \mathsf{CPY}(i, p, e) \\
\mathcal{V}([\mathsf{USM}\ \bar{a}]\_i^{i+1}, p, e) &= \mathsf{USM}(i, p, \bar{a}, e, \mathsf{U}\_p(e)) \\
\mathcal{V}([\mathsf{KIM}\ q\ \bar{a}]\_i^{i+1}, p, e) &= \mathsf{KIM}(i, p, \bar{a}, e, \mathsf{K}\_p^q(e)) \\
\mathcal{V}([\mathsf{SIG}]\_i^{i+1}, p, e) &= \mathsf{SIG}(i, p, e, [\![e]\!]\_p) \\
\mathcal{V}([\mathsf{HSH}]\_i^{i+1}, p, e) &= \mathsf{HSH}(i, p, e, \#\_p\,e) \\
\mathcal{V}([@\_q\,t]\_i^j, p, e) &= \mathsf{REQ}(i, p, q, e) \rhd \mathcal{V}(t, q, e) \rhd \mathsf{RPY}(j-1, p, q, \bar{\mathcal{E}}(t, q, e)) \\
\mathcal{V}([t\_1 \to t\_2]\_i^j, p, e) &= \mathcal{V}(t\_1, p, e) \rhd \mathcal{V}(t\_2, p, \bar{\mathcal{E}}(t\_1, p, e)) \\
\mathcal{V}([t\_1 \stackrel{\pi}{\prec} t\_2]\_i^j, p, e) &= \mathsf{SPLIT}(i, p, e, \pi\_1(e), \pi\_2(e)) \rhd \mathcal{V}(t\_1, p, \pi\_1(e)) \rhd \mathcal{V}(t\_2, p, \pi\_2(e)) \rhd \mathsf{JOIN}(j-1, p, e\_1, e\_2, e\_1 \mathrel{;;} e\_2) \\
\mathcal{V}([t\_1 \stackrel{\pi}{\sim} t\_2]\_i^j, p, e) &= \mathsf{SPLIT}(i, p, e, \pi\_1(e), \pi\_2(e)) \rhd (\mathcal{V}(t\_1, p, \pi\_1(e)) \bowtie \mathcal{V}(t\_2, p, \pi\_2(e))) \rhd \mathsf{JOIN}(j-1, p, e\_1, e\_2, e\_1 \parallel e\_2)
\end{aligned}$$

where $e\_1 = \bar{\mathcal{E}}(t\_1, p, \pi\_1(e))$ and $e\_2 = \bar{\mathcal{E}}(t\_2, p, \pi\_2(e))$.

#### **Fig. 7.** Event semantics

The R(t, p, e) relation is verified to be both irreflexive and transitive, demonstrating it is a strict partial order.

**Lemma 7 (Irreflexive).** $\neg\,\mathcal{V}(t, p, e) : v \prec v$*.*

**Lemma 8 (Transitive).** $\mathcal{V}(t, p, e) : v\_1 \prec v\_2$ *and* $\mathcal{V}(t, p, e) : v\_2 \prec v\_3$ *implies* $\mathcal{V}(t, p, e) : v\_1 \prec v\_3$*.*

Evaluating t is shown to include v if and only if v is associated with t. This ensures that all events associated with t are accounted for in the evaluation relation and that the evaluation relation does not introduce events not associated with t. Thus $\mathcal{R}(t, p, e)$ is a strict partial order for the set $\{v \mid t \diamond\_e^p v\}$.

**Lemma 9 (Correspondence).** $v \in \mathcal{V}(t, p, e)$ *iff* $t \diamond\_e^p v$*.*

Figure 7 defines event semantics in terms of the term being processed, the place managing execution, and the initial evidence. Measurement terms and evidence operations trivially translate into their corresponding atomic events whose output is the corresponding measurement or calculated result.

Simple sequential execution $t = (t_1 \to t_2)$ is defined in the canonical way: output evidence from the first operation is used as input to the second. The before operator ($\rhd$) ensures that all events from $t_1$ complete, in the order specified by R(t, p, e), before events from $t_2$ start. Note the appearance of the evidence semantics in the definition to calculate event output in the canonical fashion.

Sequential execution with data splitting $t = (t_1 \stackrel{\pi}{\prec} t_2)$ is defined by again using the before operator to ensure that $t_1$ events complete, as specified by R(t, p, e), before events from $t_2$ begin. The distinction from simple sequential execution is the use of $\pi_1$ and $\pi_2$ from $\pi$ to split evidence between $t_1$ and $t_2$. The SPLIT event routes evidence to $t_1$ and $t_2$ while JOIN composes the results, indicating sequential execution.

Parallel execution with data splitting $(t_1 \stackrel{\pi}{\sim} t_2)$ is defined using split and join events. Again $\pi_1$ and $\pi_2$ determine how evidence is routed to the composed posets. The merge operator ($\sqcup$) specifies parallel composition while respecting the orders specified for $t_1$ and $t_2$. The final before operator ensures that both posets complete before JOIN.

The $@_q\ t$ operation, responsible for making requests of other places, is defined using communication events. The term $@_q\ t$ evaluated by $p$ results in an event poset where: (i) $p$ and $q$ synchronize on a request for $q$ to perform $t$; (ii) $q$ runs $t$; and (iii) $p$ and $q$ synchronize on the reply sending the resulting evidence back to $p$. The before operator ($\rhd$) ensures that each sequential step completes before moving to the next.
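The poset constructions described above can be sketched concretely. The following minimal Python model is not the paper's Coq development: posets are pairs of an event list and a set of ordered pairs, `before` and `merge` model the before and merge operators, and the term constructors (`AT`, `LN`, `BRS`, `BRP`), the fresh-id counter standing in for annotations, and the elision of intermediate evidence are all our own simplifications.

```python
import itertools

_ids = itertools.count()  # fresh ids stand in for the annotation indices

def atom(name, p, e):
    """Singleton poset: one atomic event, no internal order."""
    return ([(name, next(_ids), p, e)], set())

def before(ps1, ps2):
    """Before operator: every event of ps1 precedes every event of ps2."""
    evs1, ord1 = ps1
    evs2, ord2 = ps2
    return (evs1 + evs2, ord1 | ord2 | {(a, b) for a in evs1 for b in evs2})

def merge(ps1, ps2):
    """Merge operator: union of the two posets, with no cross ordering."""
    evs1, ord1 = ps1
    evs2, ord2 = ps2
    return (evs1 + evs2, ord1 | ord2)

def V(t, p, e):
    """Event poset of term t at place p with initial evidence e.
    Intermediate evidence passed between subterms is elided for brevity."""
    tag = t[0]
    if tag in ("CPY", "USM", "KIM", "SIG", "HSH"):
        return atom(tag, p, e)
    if tag == "AT":                      # @_q t': REQ, remote events, RPY
        _, q, body = t
        return before(before(atom("REQ", p, e), V(body, q, e)),
                      atom("RPY", p, e))
    if tag == "LN":                      # t1 -> t2: t1 wholly before t2
        return before(V(t[1], p, e), V(t[2], p, e))
    if tag == "BRS":                     # t1 < t2: SPLIT, t1 before t2, JOIN
        body = before(V(t[1], p, e), V(t[2], p, e))
        return before(before(atom("SPLIT", p, e), body), atom("JOIN", p, e))
    if tag == "BRP":                     # t1 ~ t2: SPLIT, t1 merged with t2, JOIN
        body = merge(V(t[1], p, e), V(t[2], p, e))
        return before(before(atom("SPLIT", p, e), body), atom("JOIN", p, e))
    raise ValueError(tag)

# Parallel branch: SPLIT precedes everything, JOIN follows everything,
# and the USM and SIG events are left unordered by merge.
events, order = V(("BRP", ("USM",), ("SIG",)), "p", "e0")
```

Running the parallel-branch example yields four events in which the measurement events stay mutually unordered, matching the merge operator's behavior.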

**Definition 10.** *The* output evidence *associated with an event is the right-most evidence used to construct the event.*

**Lemma 11.** $\mathsf{V}(t, p, e)$ *always has a unique maximal event* $e_{\max}$*, and the output of* $e_{\max}$ *is* $\bar{\mathcal{E}}(t, p, e)$*.*

Lemma 11 shows that evaluating a term with the evidence semantics of Fig. 3 produces the same evidence as evaluating the same term with the event semantics of Fig. 7. Every annotated term has a unique maximal event as defined by $\mathsf{V}(t, p, e)$, implying that each finite sequence of events must have a last event. The evidence associated with that maximal event represents the evidence produced by any event sequence satisfying the partial order, and it is equal to the evidence produced by $\bar{\mathcal{E}}(t, p, e)$ for the same term, place, and evidence. Thus evaluating $t$ at place $p$ results in the same evidence under both the evidence and event semantics; specifically, $\bar{\mathcal{E}}(t, p, e)$ and $\mathsf{V}(t, p, e)$ are weakly bisimilar, producing the same result.

### **6 Small-Step Semantics**

The small-step semantics for Copland is defined as a labeled transition system (LTS) whose states represent protocol execution states and whose labels represent events interacting with the execution environment. The single-step transition relation is $s_1 \stackrel{\ell}{\leadsto} s_2$, where $s_1$ and $s_2$ are states and $\ell$ is either an event or $\tau$, denoting a silent transition. The transition $s_1 \stackrel{\ell}{\leadsto} s_2$ says that a system in state $s_1$ transitions in one step to state $s_2$, engaging in the observable event $v$ when $\ell = v$, or in no event when $\ell = \tau$. The relation $s_1 \stackrel{c}{\leadsto}^{*} s_2$ is the reflexive, transitive closure of the single-step relation; $c$ is called an event trace and is the sequence of events resulting from each state transition. The transition $s_1 \stackrel{c}{\leadsto}^{*} s_2$ says that a system in state $s_1$ transitions to state $s_2$ in zero or more steps, engaging in the event sequence $c$.

The grammar defining the set of states, S, is

$$\begin{array}{c} S \leftarrow \mathcal{C}(\bar{T}, P, E) \mid \mathcal{D}(P, E) \mid \mathcal{A}(\mathbb{N}, P, S) \mid \mathcal{L}\mathcal{S}(S, \bar{T})\\ \mid \; \mathcal{B}\mathcal{S}^{\ell}(\mathbb{N}, S, \bar{T}, P, E) \mid \; \mathcal{B}\mathcal{S}^{r}(\mathbb{N}, E, S) \mid \; \mathcal{B}\mathcal{P}(\mathbb{N}, S, S), \end{array}$$

where $P$ is the syntactic category for places, $E$ is for evidence, and $\bar{T}$ is for annotated terms. The transition relation for phrases is presented in Fig. 8.

State $\mathcal{C}(t, p, e)$ is a configuration state defining the start of evaluating $t$ at $p$ with initial evidence $e$. Its complement is the stop state $\mathcal{D}(p, e')$ defining the end of evaluation at $p$ with final evidence $e'$. The assertion $\mathcal{C}(t, p, e) \stackrel{c}{\leadsto}^{*} \mathcal{D}(p, e')$ represents evaluating $t$ at $p$, resulting in evidence $e'$ and event trace $c$.

A configuration for an atomic term transitions in one step to a done state containing measured or computed evidence after executing an event. For example, the state $\mathcal{C}([\mathsf{USM}\ \bar{a}]_i^{i+1}, p, e)$ transitions to $\mathcal{D}(p, \mathsf{U}_p(e))$ after the single event $\mathsf{USM}(i, p, \bar{a}, e, \mathsf{U}_p(e))$ performs the represented measurement. Similarly, the state $\mathcal{C}([\mathsf{CPY}]_i^{i+1}, p, e)$ transitions to $\mathcal{D}(p, e)$ after the single event $\mathsf{CPY}(i, p, e)$ copies the evidence.

The state $\mathcal{A}(j-1, p, s)$ occurs while evaluating an $[@_q\,t]_i^j$ term and is used to remember the number used to construct a reply event and the place to send the result of evaluating $t$ at $q$ after the reply event. A configuration state $\mathcal{C}([@_q\,t]_i^j, p, e)$ starts the evaluation of $@_q\,t$ by $p$ and transitions immediately to $\mathcal{A}(j-1, p, \mathcal{C}(t, q, e))$ after executing the request event $\mathsf{REQ}(i, p, q, e)$. The nested state $\mathcal{C}(t, q, e)$ represents remote term execution. Evaluation proceeds with $\mathcal{A}(j-1, p, s)$ transitioning to $\mathcal{A}(j-1, p, s')$ when $s \stackrel{v}{\leadsto} s'$. Any event $v$ associated with $s \stackrel{v}{\leadsto} s'$ is also associated with the transition $\mathcal{A}(j-1, p, s) \stackrel{v}{\leadsto} \mathcal{A}(j-1, p, s')$ and will contribute to the trace. When a state $\mathcal{A}(j-1, p, \mathcal{D}(q, e'))$ results, remote execution is complete, and the result of $q$ evaluating $t$ as requested by $p$ is $\mathcal{D}(p, e')$ after event $\mathsf{RPY}(j-1, p, q, e')$.

The state $\mathcal{LS}(s_1, t_2)$ is associated with evaluating $(t_1 \to t_2)$. State $s_1$ represents the current state of term $t_1$, and $t_2$ is the second term waiting for evaluation. The state $\mathcal{C}([t_1 \to t_2]_i^j, p, e)$ transitions to $\mathcal{LS}(\mathcal{C}(t_1, p, e), t_2)$, representing $t_1$ ready for evaluation and $t_2$ waiting. The annotation is ignored in this transition because the transitions are silent. Subsequent transitions evaluate $\mathcal{C}(t_1, p, e)$ until reaching state $\mathcal{LS}(\mathcal{D}(p, e_1), t_2)$ after producing event trace $v_1$. This state silently transitions to $\mathcal{C}(t_2, p, e_1)$, configuring $t_2$ for evaluation using $e_1$ as initial evidence. $t_2$ evaluates in a similar fashion, resulting in $e_2$ and trace $v_2$. State $\mathcal{D}(p, e_2)$ is the final state with $e_2$ as evidence, having engaged in the concatenation of $v_1$ and $v_2$, $v_1 * v_2$.
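To make the flavor of these rules concrete, here is a small Python sketch of the single-step relation restricted to the atomic CPY/USM cases and the LS states. This is a toy model with names of our own devising, not the paper's Coq development: states are tagged tuples, $\tau$ steps carry the label `None`, and USM output evidence is modelled as a tagged tuple.

```python
def step(s):
    """One LTS transition; returns (label, next_state), label None for a tau step."""
    kind = s[0]
    if kind == "C":                              # configuration state
        _, t, p, e = s
        tag = t[0]
        if tag == "CPY":                         # C([CPY], p, e) ~v~> D(p, e)
            return (("CPY", p, e), ("D", p, e))
        if tag == "USM":                         # C([USM], p, e) ~v~> D(p, U_p(e))
            return (("USM", p, e), ("D", p, ("U", p, e)))
        if tag == "LN":                          # C(t1 -> t2) ~tau~> LS(C(t1), t2)
            return (None, ("LS", ("C", t[1], p, e), t[2]))
    if kind == "LS":
        _, s1, t2 = s
        if s1[0] == "D":                         # LS(D(p, e), t) ~tau~> C(t, p, e)
            _, p, e = s1
            return (None, ("C", t2, p, e))
        lbl, s1n = step(s1)                      # step the nested state
        return (lbl, ("LS", s1n, t2))
    raise ValueError(s)                          # done states do not step

def run(s):
    """Reflexive-transitive closure: collect the event trace to a done state."""
    trace = []
    while s[0] != "D":
        lbl, s = step(s)
        if lbl is not None:
            trace.append(lbl)
    return trace, s

# Sequential composition of USM and CPY at place "p".
trace, final = run(("C", ("LN", ("USM",), ("CPY",)), "p", "e0"))
```

Running `(USM -> CPY)` produces the USM event followed by the CPY event and ends in a done state holding the measured evidence, mirroring the LS description above.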

**For atomic terms:**

$$\begin{aligned}
\mathcal{C}([\mathsf{CPY}]_i^{i+1},p,e) \stackrel{v}{\leadsto} \mathcal{D}(p,e) \qquad & [v = \mathsf{CPY}(i,p,e)] \\
\mathcal{C}([\mathsf{USM}\ \bar{a}]_i^{i+1},p,e) \stackrel{v}{\leadsto} \mathcal{D}(p,\mathsf{U}_p(e)) \qquad & [v = \mathsf{USM}(i,p,\bar{a},e,\mathsf{U}_p(e))] \\
\mathcal{C}([\mathsf{KIM}\ q\ \bar{a}]_i^{i+1},p,e) \stackrel{v}{\leadsto} \mathcal{D}(p,\mathsf{K}_p^q(e)) \qquad & [v = \mathsf{KIM}(i,p,\bar{a},e,\mathsf{K}_p^q(e))] \\
\mathcal{C}([\mathsf{SIG}]_i^{i+1},p,e) \stackrel{v}{\leadsto} \mathcal{D}(p,\{e\}_p) \qquad & [v = \mathsf{SIG}(i,p,e,\{e\}_p)] \\
\mathcal{C}([\mathsf{HSH}]_i^{i+1},p,e) \stackrel{v}{\leadsto} \mathcal{D}(p,\#_p e) \qquad & [v = \mathsf{HSH}(i,p,e,\#_p e)]
\end{aligned}$$

**For** $@_q\ t$:

$$\begin{aligned}
\mathcal{C}([@_q\,t]_i^j,p,e) \stackrel{v}{\leadsto} \mathcal{A}(j-1,p,\mathcal{C}(t,q,e)) \qquad & [v = \mathsf{REQ}(i,p,q,e)] \\
\mathcal{A}(i,p,s_1) \stackrel{v}{\leadsto} \mathcal{A}(i,p,s_2) \qquad & \text{if } s_1 \stackrel{v}{\leadsto} s_2 \\
\mathcal{A}(i,p,\mathcal{D}(q,e)) \stackrel{v}{\leadsto} \mathcal{D}(p,e) \qquad & [v = \mathsf{RPY}(i,p,q,e)]
\end{aligned}$$

**For** $t_1 \to t_2$:

$$\begin{aligned}
\mathcal{C}([t_1 \to t_2]_i^j, p, e) &\stackrel{\tau}{\leadsto} \mathcal{LS}(\mathcal{C}(t_1, p, e), t_2) \\
\mathcal{LS}(s_1, t_2) &\stackrel{v}{\leadsto} \mathcal{LS}(s_2, t_2) \qquad \text{if } s_1 \stackrel{v}{\leadsto} s_2 \\
\mathcal{LS}(\mathcal{D}(p, e), t) &\stackrel{\tau}{\leadsto} \mathcal{C}(t, p, e)
\end{aligned}$$

**For** $t_1 \stackrel{\pi}{\prec} t_2$:

$$\begin{aligned}
\mathcal{C}([t_1 \stackrel{\pi}{\prec} t_2]_i^j, p, e) &\stackrel{v}{\leadsto} \mathcal{BS}^{\ell}(j-1, \mathcal{C}(t_1, p, \pi_1(e)), t_2, p, \pi_2(e)) \quad [v = \mathsf{SPLIT}(i, p, e, \pi_1(e), \pi_2(e))] \\
\mathcal{BS}^{\ell}(i, s_1, t, p, e) &\stackrel{v}{\leadsto} \mathcal{BS}^{\ell}(i, s_2, t, p, e) \qquad \text{if } s_1 \stackrel{v}{\leadsto} s_2 \\
\mathcal{BS}^{\ell}(i, \mathcal{D}(p, e), t, p', e') &\stackrel{\tau}{\leadsto} \mathcal{BS}^{r}(i, e, \mathcal{C}(t, p', e')) \\
\mathcal{BS}^{r}(i, e, s_1) &\stackrel{v}{\leadsto} \mathcal{BS}^{r}(i, e, s_2) \qquad \text{if } s_1 \stackrel{v}{\leadsto} s_2 \\
\mathcal{BS}^{r}(i, e_1, \mathcal{D}(p, e_2)) &\stackrel{v}{\leadsto} \mathcal{D}(p, e_1 \mathbin{;;} e_2) \qquad [v = \mathsf{JOIN}(i, p, e_1, e_2, e_1 \mathbin{;;} e_2)]
\end{aligned}$$

**For** $t_1 \stackrel{\pi}{\sim} t_2$:

$$\begin{aligned}
\mathcal{C}([t_1 \stackrel{\pi}{\sim} t_2]_i^j, p, e) &\stackrel{v}{\leadsto} \mathcal{BP}(j-1, \mathcal{C}(t_1, p, \pi_1(e)), \mathcal{C}(t_2, p, \pi_2(e))) \quad [v = \mathsf{SPLIT}(i, p, e, \pi_1(e), \pi_2(e))] \\
\mathcal{BP}(i, s_1, s) &\stackrel{v}{\leadsto} \mathcal{BP}(i, s_2, s) \qquad \text{if } s_1 \stackrel{v}{\leadsto} s_2 \\
\mathcal{BP}(i, s, s_1) &\stackrel{v}{\leadsto} \mathcal{BP}(i, s, s_2) \qquad \text{if } s_1 \stackrel{v}{\leadsto} s_2 \\
\mathcal{BP}(i, \mathcal{D}(p, e_1), \mathcal{D}(p, e_2)) &\stackrel{v}{\leadsto} \mathcal{D}(p, e_1 \parallel e_2) \qquad [v = \mathsf{JOIN}(i, p, e_1, e_2, e_1 \parallel e_2)]
\end{aligned}$$

#### **Fig. 8.** Labeled transition system

States $\mathcal{BS}^{\ell}(j-1, s, t, p, e)$ and $\mathcal{BS}^{r}(j-1, e, s)$ are associated with evaluating the left and right subterms of $[t_1 \stackrel{\pi}{\prec} t_2]_i^j$ respectively. Recall that $t_1 \stackrel{\pi}{\prec} t_2$ differs from $t_1 \to t_2$ in that the initial evidence for $t_1 \stackrel{\pi}{\prec} t_2$ is split between $t_1$ and $t_2$, and the resulting evidence is the sequential composition of the evidence from $t_1$ and $t_2$. The configuration state $\mathcal{C}([t_1 \stackrel{\pi}{\prec} t_2]_i^j, p, e)$ transitions immediately to $\mathcal{BS}^{\ell}(j-1, \mathcal{C}(t_1, p, \pi_1(e)), t_2, p, \pi_2(e))$ after the split event $\mathsf{SPLIT}(i, p, e, \pi_1(e), \pi_2(e))$, where $\pi = (\pi_1, \pi_2)$. This state captures the initial configuration of $t_1$, ready to evaluate with evidence $\pi_1(e)$, along with $t_2$, waiting to execute with evidence $\pi_2(e)$ after $t_1$ completes. Evaluation proceeds with state $\mathcal{BS}^{\ell}(j-1, s, t_2, p, \pi_2(e))$ transitioning to $\mathcal{BS}^{\ell}(j-1, s', t_2, p, \pi_2(e))$ after event $v$ when $s \stackrel{v}{\leadsto} s'$. After one or more such transitions a state $\mathcal{BS}^{\ell}(j-1, \mathcal{D}(p, e'_1), t_2, p, \pi_2(e))$ is reached after event sequence $v_1$, indicating that evaluating $t_1$ has ended and $t_2$ should begin. This state transitions to $\mathcal{BS}^{r}(j-1, e'_1, s)$, with $s$ initially $\mathcal{C}(t_2, p, \pi_2(e))$ and $e'_1$ being the evidence from $t_1$. This state transitions repeatedly until a state $\mathcal{BS}^{r}(j-1, e'_1, \mathcal{D}(p, e'_2))$ results after trace $v_2$, representing completion of $t_2$. Both $t_1$ and $t_2$ are now complete with evidence $e'_1$ and $e'_2$, and the evidence must be composed. The final state transitions to $\mathcal{D}(p, e_1 \mathbin{;;} e_2)$ after the join event $\mathsf{JOIN}(j-1, p, e_1, e_2, e_1 \mathbin{;;} e_2)$, where $e_n = \bar{\mathcal{E}}(t_n, p, \pi_n(e))$.

State $\mathcal{BP}(j-1, s_1, s_2)$ is associated with parallel evaluation of $t_1$ and $t_2$. The configuration state $\mathcal{C}([t_1 \stackrel{\pi}{\sim} t_2]_i^j, p, e)$ immediately transitions to $\mathcal{BP}(j-1, \mathcal{C}(t_1, p, \pi_1(e)), \mathcal{C}(t_2, p, \pi_2(e)))$ after the split event $\mathsf{SPLIT}(i, p, e, \pi_1(e), \pi_2(e))$. Note that in this state the configurations for both $t_1$ and $t_2$ can evaluate; more generally, in any state $\mathcal{BP}(j-1, s_1, s_2)$, evaluating either $s_1$ or $s_2$ may cause the state to transition. When evaluation reaches a state of the form $\mathcal{BP}(j-1, \mathcal{D}(p, e'_1), \mathcal{D}(p, e'_2))$, both term evaluations are complete. This final state transitions to $\mathcal{D}(p, e_1 \parallel e_2)$ after the join event $\mathsf{JOIN}(j-1, p, e_1, e_2, e_1 \parallel e_2)$.

We prove Correctness, Progress, and Termination with respect to this transition system. Correctness defines congruence between the small-step operational semantics and the denotational evidence semantics: if the multi-step evaluation relation maps state $\mathcal{C}(t, p, e)$ to $\mathcal{D}(p, e')$, then $\bar{\mathcal{E}}(t, p, e) = e'$.

#### **Lemma 12 (Correctness).** *If* $\mathcal{C}(t, p, e) \stackrel{c}{\leadsto}^{*} \mathcal{D}(p, e')$ *then* $\bar{\mathcal{E}}(t, p, e) = e'$*.*

Progress states that every state is either a stop state of the form $\mathcal{D}(p, e)$ or can take a step. With the Progress lemma we know that there are no "stuck" states in the operational semantics.

**Lemma 13 (Progress).** *Either* $s_1 = \mathcal{D}(p, e)$ *for some* $p$ *and* $e$*, or* $s_1 \stackrel{v}{\leadsto} s_2$ *for some* $v$ *and* $s_2$*.*

Termination states that any configuration state will transition to a done state of the form D(p, e) in a finite number of steps. This is a strong condition that assures evaluation of any well-formed term will terminate.

**Lemma 14 (Termination).** *For some* $n$*,* $\mathcal{C}(t, p, e) \stackrel{c}{\leadsto}^{n} \mathcal{D}(p, e')$*.*

### **7 Proof Summary**

The ordering of events is a critically important property of attestation systems. Even when measurement events properly execute individually, their ordering is what establishes trust chains. If a component performs measurement before being measured, any trust in that component and subsequent components is lost.

Figure 1 shows phrases denoted as event posets and defined operationally as a labeled transition system. The event posets define legal orderings of events in traces while the LTS defines traces associated with phrase evaluation. The remaining theoretical result is proving that the small-step semantics produces traces compatible with the partial order semantics.

To present event sequences we use the classical notation $\langle v_1, v_2, \ldots, v_n \rangle$ for sequence construction and $c \downarrow i$ to select the $i$th element from sequence $c$. The concatenation of $c_1$ and $c_2$ is $c_1 * c_2$. Event $v$ is *earlier* than event $v'$ in trace $c$, written $v <_c v'$, iff there exist $i$ and $j$ such that $i < j$, $c \downarrow i = v$, and $c \downarrow j = v'$.
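The earlier-than relation translates directly into code; a quick Python sketch (the event representation is immaterial, and the function name is our own):

```python
def earlier(c, v1, v2):
    """v1 <_c v2: some occurrence of v1 strictly precedes one of v2 in trace c."""
    return any(c[i] == v1 and c[j] == v2
               for i in range(len(c)) for j in range(i + 1, len(c)))

# A request trace: REQ precedes USM, which precedes RPY.
c = ["REQ", "USM", "RPY"]
```

Note that the relation is strict: an event is never earlier than itself at the same position.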

The main correctness theorem states that if some term $t$ evaluates to evidence $e'$ after trace $c$, and two events $v$ and $v'$ from $c$ are ordered by the event semantics, then that order is guaranteed in $c$. Said differently, if the event semantics constrains two events, then the small-step LTS semantics respects that constraint. This is stated formally as Theorem 15.

**Theorem 15 (Correctness).** *If* $\mathcal{C}(t, p, e) \stackrel{c}{\leadsto}^{*} \mathcal{D}(p, e')$ *and* $\mathsf{V}(t, p, e) : v \prec v'$*, then* $v <_c v'$*.*

The proof proceeds in two steps, using a big-step semantics defining traces for individual phrases as an intermediary. The inductive structure of the big-step semantics more closely matches the inductive structure of the partial order semantics, easing the proofs about the relation between the two.

The intermediate big-step semantics is specified as a relation between an annotated term $t$, place $p$, evidence $e$, and trace $c$, written $t \,\Box_e^p\, c$. The structure of the definition is similar to the structure of the relation in Fig. 6. Most cases of the definition are straightforward event sequences taken from the small-step semantics.

For atomic actions, the associated sequence is a single event implementing the action. As an illustrative example, $\mathsf{USM}\ \bar{a}$ is associated with

$$[\mathsf{USM}\ \bar{a}]_i^{i+1} \,\Box_e^p\, \langle \mathsf{USM}(i, p, \bar{a}, e, \mathsf{U}_p(e)) \rangle.$$

For remote actions, $@_q\ t$, the associated trace starts with a request event, followed by the trace $c$ executed remotely, and ends with a reply event:

$$[@_q\,t]_i^j \,\Box_e^p\, \langle \mathsf{REQ}(i, p, q, e) \rangle * c * \langle \mathsf{RPY}(j-1, p, q, \bar{\mathcal{E}}(t, q, e)) \rangle \qquad \text{if } t \,\Box_e^q\, c.$$

For sequential actions, $(t_1 \to t_2)$, the associated trace starts with the trace $c_1$ associated with $t_1$ and ends with the trace $c_2$ associated with $t_2$ starting with evidence $e_1$ from $c_1$:

$$[t_1 \to t_2]_i^j \,\Box_e^p\, c_1 * c_2 \qquad \text{if } t_1 \,\Box_e^p\, c_1 \text{ and } t_2 \,\Box_{e_1}^p\, c_2,$$

where $e_1 = \bar{\mathcal{E}}(t_1, p, e)$.

For sequential branching, $(t_1 \stackrel{\pi}{\prec} t_2)$, the associated trace starts with a split event and continues with trace $c_1$ associated with $t_1$ starting with $\pi_1(e)$, followed by trace $c_2$ associated with $t_2$ starting with $\pi_2(e)$:

$$[t_1 \stackrel{\pi}{\prec} t_2]_i^j \,\Box_e^p\, \langle v_1 \rangle * c_1 * c_2 * \langle v_2 \rangle \qquad \text{if } t_1 \,\Box_{\pi_1(e)}^p\, c_1 \text{ and } t_2 \,\Box_{\pi_2(e)}^p\, c_2,$$

where

$$\begin{array}{l}
v_1 = \mathsf{SPLIT}(i, p, e, \pi_1(e), \pi_2(e)) \\
v_2 = \mathsf{JOIN}(j-1, p, e_1, e_2, e_1 \mathbin{;;} e_2) \\
e_1 = \bar{\mathcal{E}}(t_1, p, \pi_1(e)) \\
e_2 = \bar{\mathcal{E}}(t_2, p, \pi_2(e)).
\end{array}$$

The case for parallel branching, $(t_1 \stackrel{\pi}{\sim} t_2)$, requires additional work to capture parallel execution semantics using trace interleaving. We write $il(c, c', c'')$ to assert that trace $c$ is a result of interleaving $c'$ with $c''$.

#### **Definition 16 (Interleave).** $il(c, c', c'')$ *is the smallest relation such that*

*1. il*(c,, c) *and il*(c, c,)*; 2. il*(v ∗ c,v ∗ c , c) *if il*(c, c , c)*; and 3. il*(v ∗ c, c ,v ∗ c) *if il*(c, c , c)*.*

When $c$ is an interleaving of $c'$ and $c''$, $v_1 <_{c'} v_2$ implies $v_1 <_c v_2$ and $v_1 <_{c''} v_2$ implies $v_1 <_c v_2$, but the order of events in $c$ is otherwise unconstrained.

With interleaving defined, the trace for $(t_1 \stackrel{\pi}{\sim} t_2)$ begins with a split event and continues with an interleaving of the traces $c'$ and $c''$ associated with $t_1$ and $t_2$, starting with $\pi_1(e)$ and $\pi_2(e)$ respectively. The trace ends with a join event when both interleaved traces end:

$$[t\_1 \stackrel{\pi}{\sim} t\_2]\_i^j \Box\_e^p \langle v\_1 \rangle \ast c \ast \langle v\_2 \rangle \quad \text{if } t\_1 \Box\_{\pi\_1(e)}^p c', \, t\_2 \Box\_{\pi\_2(e)}^p c'', \text{ and } il(c, c', c''),$$

where

$$\begin{array}{l}
v_1 = \mathsf{SPLIT}(i, p, e, \pi_1(e), \pi_2(e)) \\
v_2 = \mathsf{JOIN}(j-1, p, e_1, e_2, e_1 \parallel e_2) \\
e_1 = \bar{\mathcal{E}}(t_1, p, \pi_1(e)) \\
e_2 = \bar{\mathcal{E}}(t_2, p, \pi_2(e)).
\end{array}$$
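The big-step trace relation, including the interleavings of the parallel case, can be prototyped as a generator of trace sets. This Python sketch elides evidence arguments and annotation indices, and the constructors (`AT`, `LN`, `BRS`, `BRP`) are names of our own choosing, not the paper's Coq development:

```python
def interleavings(c1, c2):
    """Yield every interleaving of two traces (tuples of events)."""
    if not c1:
        yield c2
        return
    if not c2:
        yield c1
        return
    for rest in interleavings(c1[1:], c2):
        yield (c1[0],) + rest
    for rest in interleavings(c1, c2[1:]):
        yield (c2[0],) + rest

def traces(t, p):
    """Set of big-step traces for term t at place p (evidence elided)."""
    tag = t[0]
    if tag in ("CPY", "USM", "KIM", "SIG", "HSH"):
        return {((tag, p),)}                     # single atomic event
    if tag == "AT":                              # <REQ> * remote trace * <RPY>
        q = t[1]
        return {(("REQ", p, q),) + c + (("RPY", p, q),)
                for c in traces(t[2], q)}
    if tag == "LN":                              # c1 * c2
        return {c1 + c2 for c1 in traces(t[1], p) for c2 in traces(t[2], p)}
    if tag == "BRS":                             # <SPLIT> * c1 * c2 * <JOIN>
        return {(("SPLIT", p),) + c1 + c2 + (("JOIN", p),)
                for c1 in traces(t[1], p) for c2 in traces(t[2], p)}
    if tag == "BRP":                             # <SPLIT> * il(c, c1, c2) * <JOIN>
        return {(("SPLIT", p),) + c + (("JOIN", p),)
                for c1 in traces(t[1], p) for c2 in traces(t[2], p)
                for c in interleavings(c1, c2)}
    raise ValueError(tag)

# Parallel branch of USM and SIG at place "p": two possible traces.
ts = traces(("BRP", ("USM",), ("SIG",)), "p")
```

For the parallel example this yields exactly two traces, both starting with the SPLIT event and ending with the JOIN event, with the USM and SIG events in either order between them.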

The following two lemmas show that every trace in the big-step semantics contains the correct events. Lemma 17 asserts that the right number of events occurs and Lemma 18 asserts that all events do in fact occur in the trace.

**Lemma 17.** $[t]_i^j \,\Box_e^p\, c$ *implies the length of* $c$ *is* $j - i$*.*

**Lemma 18.** $t \,\Box_e^p\, c$ *implies* $t \vdash_p^e v$ *iff for some* $i$*,* $v = c \downarrow i$*.*

The first step in the proof of Theorem 15 is to show that a trace of the small-step semantics is also a trace of the big-step semantics, as shown in Lemma 19. The lemma asserts that any trace $c$ resulting from evaluating $t$ is also related to $t$ in the big-step semantics.

#### **Lemma 19.** $\mathcal{C}(t, p, e) \stackrel{c}{\leadsto}^{*} \mathcal{D}(p, e')$ *implies* $t \,\Box_e^p\, c$*.*

The next step is to show that if c is a trace of the big-step semantics, then that trace is compatible with the partial order semantics.

**Lemma 20.** *If* $t \,\Box_e^p\, c$ *and* $\mathsf{V}(t, p, e) : v \prec v'$*, then* $v <_c v'$*.*

The proof of Theorem 15 follows from a transitive composition of Lemmas 19 and 20.

The real value of Theorem 15 is that it triangulates specifications, implementations, and formal analysis as depicted in Fig. 1. On one hand, the operational semantics is immediately implementable, which allows us to explicitly test and experiment with alternative options as specified in Copland. On the other hand, simple testing is not sufficient to understand the trust properties provided by alternative options. It is better to offer potential users the ability to analyze Copland phrases to establish (or refute) desired trust properties. This is the primary purpose of the event poset semantics. Our prior work on the analytic principles of layered attestation [17,18] is based on partially ordered sets of measurement and processing events. That work details how to characterize what an adversary would have to do in order to escape detection by a given collection of events. In particular, it establishes that bottom-up strategies for measurement and evidence bundling force an adversary to perform either recent or deep corruptions. Recent corruptions must occur within a small time window, which intuitively raises the bar for an adversary. Similarly, deep corruptions must burrow into lower (and presumably better protected) system layers, also raising the bar for the adversary.

Although the event posets in Copland's denotational semantics are somewhat richer than those in [17,18], the reasoning principles can easily be adapted to this richer setting. This enables a verification methodology in which Copland phrases are compiled to event posets, then analyzed according to these principles. In this way, the relative strength of Copland phrases could be directly compared according to the trust properties they guarantee. Theorem 15 ensures that any conclusions made on the basis of this static analysis must also hold for dynamic executions conforming to the operational semantics. It essentially transfers formal guarantees into the world of concrete implementations. We are currently exploring methods to more explicitly leverage such formal analysis to help Maat users write local policies based on the relative strength of Copland phrases.

### **8 Related Work**

The concept of adapting an attestation to the layered structure of a target system is not new. The concept is already present in attestation systems like trusted boot [15] and Integrity Measurement Architecture (IMA) [19] which leverage a layered architecture to create static, boot-time or load-time measurements of system components. Other solutions have designed layered architectures to enable attestation of the runtime state of a system [10,22]. A major focus is on information flow integrity properties since this allows fine-grained, local measurements to be composed without having to measure the entire system [20]. The main contrast between this line of research and our work is that they fix the structure of an attestation based on the structure of the target architecture, whereas in our work, we support extensible attestation specifications that can be altered to suit many different architectures and many different contexts for trust decisions.

Coker et al. [4] present a general approach for using virtualization to achieve a layered architecture, along with generic principles for remote attestation suggesting the possibility of diverse, policy-based orchestrations of attestations. These principles have recently been extended in [13] in the context of cloud systems built with Trusted Platform Modules (TPMs) and virtual TPMs [9].

Several implementations of measurement and attestation (M&A) frameworks have been proposed to address the need for a central service to manage policies for the orchestration and collection of integrity evidence. The Maat framework, as described in Sect. 2, is being utilized by the authors as a testing ground for Copland. Maat provides a pluggable interface for Attestation Service Providers (ASPs), functional units of measurement which are executed by Attestation Protocol Blocks (APBs) after a negotiation between an attester and appraiser machine [16]. Another architecture, given in [8], implements a policy mechanism designed to allow the appraiser to ask for different conditions to be satisfied by the target for different types of interactions. The main focus is on determining suitability of the target system to handle sensitive data. Negotiation between systems and frameworks, and the supporting policy specification, are examples of places where Copland can be leveraged to provide a common language and understanding of attestation guarantees.

Another line of research has focused on hardware/software co-design for embedded devices to enable remote attestation on platforms that are constrained in various ways [2,6,7]. For example, the absence of a TPM can increase an adversary's ability to forge evidence. A careful co-design of hardware and software allows them to tailor attestation protocols to the particular structure of a target device. More recently, Multiple-Tier Remote Attestation (MTRA) extends this work with a protocol that is specifically targeted for the attestation of heterogeneous IoT networks [21]. This protocol uses a preparation stage to configure attestations where more-capable devices (those with TPMs, for example) provide a makeshift root of trust for less-capable devices and measurement of the entire network is distributed across the more-capable devices. We believe that Copland would be beneficial in specifying the complex set of actions required of these heterogeneous networks.

Finally, there has been some work on the semantics of attestation. Datta et al. [5] introduce a formal logic for trusted computing systems. Its semantics is similar to our operational semantics in that it works as a transition system on state configurations. The underlying programming language was designed specifically for the logic and is considerably more complex than Copland; it was not designed to be used by implementations as part of a negotiation, and the logic appears to have been applied only to static measurements such as trusted boot. We also previously developed a formal approach to the semantics of dynamic measurement [17,18]. In that work we characterize the benefit of a bottom-up measurement strategy as constraining the adversary to corrupt quickly or deeply. These results are obtained from a partial order of events consisting of measurements and evidence bundling. As discussed above, this basis is similar to our partially ordered event semantics; we explicitly provide such a semantics to leverage the formal results that such analysis can obtain. While our set of events is richer, we expect the methods of this line of research to apply.

### **9 Conclusion and Ongoing Work**

Copland serves as a basis for discussing the formal properties of attestation protocols under composition. We have described the denotational semantics of Copland by mapping phrases to evidence and to partially ordered event sets describing events associated with a phrase and constraints on event ordering. While the denotational semantics does not specify unique traces, it specifies event orderings mandatory for believing evidence resulting from evaluation.

We have described the operational semantics of Copland by associating phrases with a labeled transition system. States capture evidence and order execution while labels on transitions describe resulting events. The transitive closure of the LTS transition function describes traces associated with LTS execution.

We then show the small-step semantics generates traces that obey partial orderings specified by the denotational semantics. Furthermore, we show those orderings are preserved under protocol composition. This result is vital to the correctness of attestation outcomes whose validity is equally dependent on resulting evidence and the proper ordering of evidence gathering events.

Beyond the correctness proof, the most impactful contribution of Copland semantics is a foundation for testing and experimenting with layered attestation protocols, pushing the bounds of complexity and diversity of application. We are actively exploring advanced attestation scenarios between Maat Attestation Managers (AMs). Recall from the introduction that Maat is a policy-based measurement and attestation (M&A) framework which provides a centralized, pluggable service to gather and report integrity measurements [16]. The Maat team is leveraging Copland to test attestation scenarios involving the configuration of multiple instances of Maat in multi-realm and multi-party scenarios. In addition to its application to traditional Linux platforms, the Maat framework has been applied to IoT device platforms, where different configurations due to limited resources were explored [3]. We believe frameworks such as Maat provide a rich testing ground for the application of Copland as the basis of policy specification and negotiation across many kinds of system architectures, and are feeding the lessons learned in this application back into the on-going Copland research.

The authors are also using Copland as an implementation language for remote attestation protocols in other systems. A collection of Copland interpreters written in Haskell, F# and CakeML [12], running on Linux, Windows 10 and seL4 [11], provides a mechanism for executing Copland phrases. Each interpreter forms the core of an AM that receives phrases, calls the interpreter, and returns evidence. Additionally, the AMs maintain and protect keys associated with places as well as policies mapping USM and KIM instances to specific implementations. Policies are critically important as they describe details of measurers held abstract within a phrase. Policies will eventually play a central role in negotiating attestation protocols among the various AMs implementing complex, layered attestations. A common JSON exchange format allows exchange of phrases and evidence among AMs running on different systems.

Of particular note, the CakeML interpreter targeting the seL4 platform will be formally verified with respect to the formal Copland semantics. CakeML implements a formally verified fragment of ML in the HOL4 proof system while seL4 provides a verified microkernel with VMM support. Verifying the Copland CakeML implementation and individual Copland phrases requires embedding the CakeML semantics in Coq. The Copland implementation will then be verified with respect to the formal semantics. Additionally, the Coq semantics supports proof search techniques for synthesizing Copland phrases. Running the CakeML implementation on the seL4 platform with formally synthesized phrases provides a verified attestation platform that may be retargeted to any environment supporting seL4.

As we continue exploring the richness of layered attestation we are also developing type systems and static checkers that determine the correctness of specific protocols, as well as protocol interpreters and compilers that produce provably correct results relative to the Copland semantics. We are considering extensions to Copland that include nonces, lambda expressions, keys, and TPM interactions to represent a richer set of protocols. Without this formal semantics, it would be impossible to consider the correctness of such extensions.

### **A Annotated Terms**

As noted in Sect. 4, when t is annotated by i and j, we write [t]<sub>i</sub><sup>j</sup>. The annotations are used in the Coq proofs to construct sequences of unique events associated with collecting the evidence specified by the term.

```
anno(i, CPY) = (i + 1, [CPY]_i^{i+1})
anno(i, KIM p ā) = (i + 1, [KIM p ā]_i^{i+1})
anno(i, USM ā) = (i + 1, [USM ā]_i^{i+1})
anno(i, SIG) = (i + 1, [SIG]_i^{i+1})
anno(i, HSH) = (i + 1, [HSH]_i^{i+1})

anno(i, @p t) =
  let (j, a) = anno(i + 1, t) in
  (j + 1, [@p a]_i^{j+1})

anno(i, t1 → t2) =
  let (j, a1) = anno(i, t1) in
  let (k, a2) = anno(j, t2) in
  (k, [a1 → a2]_i^k)

anno(i, t1 ≺^s t2) =
  let (j, a1) = anno(i + 1, t1) in
  let (k, a2) = anno(j, t2) in
  (k + 1, [a1 ≺^s a2]_i^{k+1})

anno(i, t1 ∼^s t2) =
  let (j, a1) = anno(i + 1, t1) in
  let (k, a2) = anno(j, t2) in
  (k + 1, [a1 ∼^s a2]_i^{k+1})
```
**Fig. 9.** Term annotation

Terms are annotated using the function displayed in Fig. 9. An annotated term for t = KIM p a¯ → SIG is

$$\mathsf{anno}(0,t) = \left(2, [[\mathsf{KIM}\ p\ \bar{a}]\_0^1 \to [\mathsf{SIG}]\_1^2]\_0^2\right),$$

and when t = @<sub>p</sub> USM ā,

$$\mathsf{anno}(0,t) = \left(3, [@\_p\, [\mathsf{USM}\ \bar{a}]\_1^2]\_0^3\right).$$

**Lemma 21.** *anno(i, t) ∈ T<sub>i</sub>.*
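The annotation function of Fig. 9 can also be run directly. The following Python transcription reproduces the two worked examples above; the tuple encoding of terms (operator name first, annotated terms as `("ann", i, j, body)`) is our own, purely illustrative choice:

```python
# Executable transcription of the annotation function of Fig. 9.
# Encoding (ours): a term is a tuple headed by its operator, e.g.
# ("KIM", p, args), ("->", t1, t2), ("@", p, t), ("bseq", s, t1, t2),
# ("bpar", s, t1, t2); an annotated term [t]_i^j is ("ann", i, j, t').
def anno(i, t):
    tag = t[0]
    if tag in ("CPY", "KIM", "USM", "SIG", "HSH"):
        return (i + 1, ("ann", i, i + 1, t))
    if tag == "@":                        # @p t: one extra event on each side
        _, p, t1 = t
        j, a = anno(i + 1, t1)
        return (j + 1, ("ann", i, j + 1, ("@", p, a)))
    if tag == "->":                       # linear sequencing t1 -> t2
        _, t1, t2 = t
        j, a1 = anno(i, t1)
        k, a2 = anno(j, t2)
        return (k, ("ann", i, k, ("->", a1, a2)))
    if tag in ("bseq", "bpar"):           # branching t1 s-seq t2 / t1 s-par t2
        _, s, t1, t2 = t
        j, a1 = anno(i + 1, t1)
        k, a2 = anno(j, t2)
        return (k + 1, ("ann", i, k + 1, (tag, s, a1, a2)))
    raise ValueError("unknown term: %r" % (t,))

# First worked example, anno(0, KIM p a -> SIG): outer annotation [.]_0^2
assert anno(0, ("->", ("KIM", "p", ()), ("SIG",)))[0] == 2
```

Running it on the second example, `anno(0, ("@", "p", ("USM", ())))`, yields the pair (3, [@<sub>p</sub> [USM ā]<sub>1</sub><sup>2</sup>]<sub>0</sub><sup>3</sup>), matching the displayed result.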

### **B Coq Cross Reference**

Table 1 matches the contents of each figure with its definition in the Coq proofs. Table 2 does the same for the lemmas, definitions, and the theorem.


**Table 1.** Coq figure cross reference

**Table 2.** Coq cross reference


### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Verifying Liquidity of Bitcoin Contracts**

Massimo Bartoletti<sup>1(B)</sup> and Roberto Zunino<sup>2</sup>

<sup>1</sup> Università degli Studi di Cagliari, Cagliari, Italy bart@unica.it

<sup>2</sup> Università degli Studi di Trento, Trento, Italy

**Abstract.** A landmark security property of smart contracts is *liquidity*: in a non-liquid contract, it may happen that some funds remain frozen. The relevance of this issue is witnessed by a recent liquidity attack to the Ethereum Parity Wallet, which has frozen ∼160M USD within the contract, making this sum unredeemable by any user. We address the problem of verifying liquidity of Bitcoin contracts. Focussing on BitML, a contracts DSL with a computationally sound compiler to Bitcoin, we study various notions of liquidity. Our main result is that liquidity of BitML contracts is decidable, in all the proposed variants. To prove this, we first transform the infinite-state semantics of BitML into a finite-state one, which focusses on the behaviour of any given set of contracts, abstracting the context moves. With respect to the chosen contracts, this abstraction is sound and complete. Our decision procedure for liquidity is then based on model-checking the finite space of states of the abstraction.

**Keywords:** Bitcoin · Smart contracts · Verification

### **1 Introduction**

Decentralized ledgers like Bitcoin and Ethereum [19,32] enable the trustworthy execution of *smart contracts*—computer protocols which regulate the exchange of assets among mutually untrusted users. The underlying protocols used to update the ledger (which defines the state of each contract) ensure that, even without trusted intermediaries, the execution of contracts is correct with respect to the contract rules. However, it may happen that the rules themselves are not correct with respect to the behaviour expected by the users. Indeed, all the attacks to smart contracts successfully carried out so far, which have plundered or frozen millions of USD in Ethereum [1–3,8,27,30], exploit some discrepancy between the intended and the actual behaviour of a contract.

To counteract these attacks, the research community has recently started to formalize smart contracts and their security properties [22–24], and to develop automated verification tools based on these models [21,27,31,35]. As a matter of fact, most of this research targets Ethereum, the most widespread (and attacked) platform for smart contracts: for this reason, the security properties addressed by current tools focus on specific features of Solidity, the high-level language for smart contracts in Ethereum. For instance, some vulnerability patterns checked by these tools are reentrancy and mishandled exceptions, whose peculiar implementation in Solidity has led to attacks, like the one to the DAO [1]. Only a few tools verify *general* security properties of smart contracts, which would be meaningful also outside the realm of Ethereum. Among these works, [35] checks a property called *liquidity*, which holds when the contract always admits a trace where its balance is decreased (so, the funds stored within the contract do not remain frozen). This has been inspired by a recent attack to Ethereum [2], which has frozen ∼160M USD within a contract by exploiting a bug in a library. While this notion correctly classifies that particular contract as non-liquid, it would classify as liquid any contract where the adversary can lock some funds and redeem them at a later moment. Stronger notions of liquidity may rule out these unsafe contracts, e.g. by checking that funds are never frozen *for all* possible strategies of the adversary. Studying liquidity in a more general setting would be important for various reasons. First, taking adversaries into account would allow the detection of more security issues than those checked by the current verification tools. Second, platform-agnostic notions of liquidity could be applied to forthcoming blockchain technologies, e.g. [20,34].
Third, studying liquidity in simpler settings than Ethereum could simplify the verification problem, which is undecidable in Turing-powerful languages like those supported by Ethereum.

**Contributions.** We study several notions of liquidity for smart contracts, in a general setting where their behaviour is defined as a transition system. We then consider the special case where contracts are expressed in BitML, a high-level DSL for smart contracts which compiles into Bitcoin [14]. In this setting, we develop a verification technique for liquidity of smart contracts. We can summarise our main contributions as follows:


Our finite-state abstraction is general-purpose: verifying liquidity is only one of its possible applications (some other applications are discussed in Sect. 6).

**Related Works.** Several recent works study security issues related to Ethereum smart contracts. A few papers address EVM, the bytecode language which is the target of compilation of Solidity. Among them, [27] introduces an operational semantics of a simplified version of EVM, and develops Oyente, a tool to detect some vulnerability patterns of EVM contracts through symbolic execution. Securify [35] checks vulnerability patterns by analysing dependency graphs extracted from EVM code. As mentioned before, this tool also addresses a form of liquidity, which essentially assumes a cooperating adversary. EtherTrust [21] is a framework for the static verification of EVM contracts, which can establish e.g. the absence of reentrancy vulnerabilities. This tool is based on the detailed formalisation of EVM provided in [22], which is validated against the official Ethereum test suite. The work [23] introduces an executable semantics of EVM, specified in the K framework. The tool in [18] translates Solidity and EVM code into F∗, and uses its verification tools to detect vulnerabilities of contracts; further, the tool verifies the equivalence between a Solidity program and an alleged compilation of it into EVM. The work [24] verifies EVM code through the Isabelle/HOL proof assistant [33], proving that, upon an invocation of a specific contract, only its owner can decrease the balance.

Smart contracts in Bitcoin have a completely different flavour compared to Ethereum, since they are usually expressed as cryptographic protocols, rather than as programs. Despite the limited expressiveness of the scripts in Bitcoin transactions [10], several kinds of contracts for Bitcoin have been proposed [9]: they range from lotteries [6,7,13,29], to general multiparty computations [4,17,26], to contingent payments [11,28], etc. All these works focus on proving the security of a *fixed* contract, unlike the above-mentioned works on Ethereum, where the goal is to verify arbitrary contracts. As far as we know, only a couple of works pursue this goal for Bitcoin. The tool in [25] analyses Bitcoin scripts, in order to find under which conditions the enclosing transaction can be redeemed. Compared to [25], our work verifies contracts spanning many transactions, rather than single scripts. The work [5] models contracts as timed automata, and then uses the Uppaal model checker [16] to verify their properties. The contracts modelled as in [5] cannot be directly translated to Bitcoin, while in our approach we can exploit the BitML compiler to translate contracts to standard Bitcoin transactions. Note also that the properties considered in [5] are specific to the modelled contract, while in this work we are interested in verifying general properties of contracts, like liquidity.

### **2 Overview**

In this section we briefly overview BitML; we then give some intuition about liquidity and our verification technique. Because of space limits, we refer to [14] for a detailed treatment of BitML, and to [12] for a more gentle introduction.

We assume a set of *participants*, ranged over by A,B,..., and a set of names, of two kinds: x,y,... denote *deposits* of B, while a, b,... denote *secrets*. We write *x* (resp. *a*) for a finite sequence of deposit (resp. secret) names.

#### **2.1 BitML in a Nutshell**

BitML is a domain-specific language for Bitcoin smart contracts, which allows participants to exchange cryptocurrency according to pre-agreed contract rules. In BitML, any participant can broadcast a *contract advertisement* {G}C, where


**Fig. 1.** Syntax of BitML contracts and preconditions.


**Fig. 2.** Syntax of predicates.

C is the actual contract, specifying the rules to transfer bitcoins (B), while G is a set of *preconditions* to its execution.

Preconditions (Fig. 1, left) may require participants to deposit some B in the contract (either upfront or at runtime), or to commit to some secret. More in detail, A:! v @ x requires A to own vB in a deposit x, and to spend it for stipulating a contract C. Instead, A:? v @ x only requires A to pre-authorize the spending of x, which can be gathered by the contract at run-time. The precondition A:secret a requires A to commit to a secret a before C starts.

After {G}C has been advertised, each participant can choose whether to accept it, or not. When all the preconditions G have been satisfied, and all the involved participants have accepted, the contract C becomes *stipulated*. The contract starts its execution with a balance, initially set to the sum of the ! deposits required by its preconditions. Running C will affect this balance, when participants deposit/withdraw funds to/from the contract.

A contract C is a *choice* among zero or more branches. Each branch is a *guarded contract* (Fig. 1, right) which enables an action, and possibly proceeds with a continuation C′. The guarded contract withdraw A transfers the whole balance to A, while split v<sub>1</sub> → C<sub>1</sub> | ··· | v<sub>n</sub> → C<sub>n</sub> decomposes the contract into n parallel components C<sub>i</sub>, each one with balance v<sub>i</sub>. The guarded contract put *x* & reveal *a* if p atomically performs the following: (i) spend all the ? deposits *x*, adding their values to the contract balance; (ii) check that all the secrets *a* have been revealed and satisfy the predicate p (Fig. 2). When enabled, the above-mentioned actions can be fired by anyone, at any time. To restrict *who* can execute actions and *when*, one can use the decoration A : D, which requires the authorization of A, and the decoration after t : D, which requires waiting until time t.
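The syntax just described can be rendered as a small abstract-syntax tree. In this hypothetical Python sketch the constructor names follow the paper, but the field layout and the encoding of a contract as a list of guarded branches are our own choices:

```python
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative encoding of BitML contracts (Fig. 1); not the paper's
# formal syntax, just one plausible data representation of it.

@dataclass
class Withdraw:                 # withdraw A: transfer the whole balance to A
    who: str

@dataclass
class Split:                    # split v1 -> C1 | ... | vn -> Cn
    branches: List[Tuple[int, list]]   # (balance, sub-contract) pairs

@dataclass
class PutReveal:                # put x & reveal a if p . C
    deposits: List[str]
    secrets: List[str]
    pred: str
    cont: list                  # continuation contract

@dataclass
class Auth:                     # A : D  (requires A's authorization)
    who: str
    branch: object

@dataclass
class After:                    # after t : D  (enabled from time t on)
    time: int
    branch: object

# A contract is a choice among guarded branches, encoded as a list.
# The timed commitment of the next paragraph, with t = 10 as an example:
tc = [
    PutReveal([], ["a"], "true", [Withdraw("A")]),  # reveal a. withdraw A
    After(10, Withdraw("B")),                       # after t : withdraw B
]
```

A choice with zero branches is simply the empty list, and decorations nest: `Auth("A", After(10, Withdraw("B")))` requires both A's authorization and the deadline.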

**A Basic Example.** As a first example, we express in BitML the *timed commitment* [6], a basic protocol to construct more complex contracts, like e.g. lotteries and other games [7]. In the timed commitment, a participant A wants to choose a secret, and promises to reveal it before some time t. The contract ensures that if A does not reveal the secret in time, then she will pay a penalty of 1B to B (e.g., the opponent player in a game). In BitML, this is modelled as follows:

{A:! 1 @ x | A:secret a}  (reveal a. withdraw A + after t : withdraw B)

The precondition requires A to pay 1B upfront, and to commit to a secret a. The contract (hereafter, named *TC*) is a non-deterministic choice between two branches. Only A can choose the first branch, by performing reveal a (syntactic sugar for put [] & reveal a if *true*). Subsequently, anyone can transfer 1B to A. Only after t, if the reveal has not been fired, any participant can fire withdraw B in the second branch, moving 1B to B. So, before t, A has the option to reveal a (avoiding the penalty), or to keep it secret (paying the penalty). If neither branch has been taken by time t, both are enabled, and the first participant who fires a withdraw gets the 1B.

#### **2.2 BitML Semantics**

We briefly recall from [14] the semantics of BitML. The semantics is a labelled transition system over configurations, which are parallel compositions of: deposits ⟨A, v⟩<sub>x</sub>; active contracts ⟨C, v⟩<sub>x</sub>; contract advertisements {G}C; authorizations; committed secrets {A : a#N} and revealed secrets A : a#N; and the current time t.


We now illustrate the BitML semantics by examples; when time is immaterial, we only show the steps of the untimed semantics. We omit labels on transitions.

**Deposits.** When A owns a deposit ⟨A, v⟩<sub>x</sub>, she can use it in various ways: she can divide the deposit into two smaller deposits, or join it with another deposit of hers to form a larger one; the deposit can also be transferred to another participant, or destroyed. For instance, to donate a deposit x to B, A must first issue the authorization A[x ▷ B]; then, anyone can transfer the money to B:

$$\langle \mathsf{A}, v \rangle\_x \mid \cdots \to \langle \mathsf{A}, v \rangle\_x \mid \mathsf{A}[x \rhd \mathsf{B}] \mid \cdots \to \langle \mathsf{B}, v \rangle\_y \mid \cdots \tag{y \text{ fresh}}$$

We assume that whenever a participant authorizes an operation on some deposit x, then she is also authorising a self-donation A[x ▷ A] of such deposit.<sup>1</sup>

<sup>1</sup> This assumption, while helpful to simplify the subsequent technical development, does not allow an adversary to steal money; at worst, the adversary can use the authorization to transfer the money back to the original owner.

**Advertisement.** Any participant can advertise a new contract C (with preconditions G). This is obtained by performing the step Γ → Γ | {G}C.

**Stipulation.** Stipulation turns a contract advertisement into an active contract. For instance, let G = A:! 1 @ x | A:? 1 @ y | A:secret a. Given a contract C, the stipulation of {G}C is done in a few steps:

$$\langle \mathsf{A},1 \rangle\_x \mid \langle \mathsf{A},1\rangle\_y \mid \{G\}C \to^\* \langle \mathsf{A},1\rangle\_y \mid \langle C,1\rangle\_z \mid \{\mathsf{A}:a\#N\}$$

Above, the funds in the deposit x are transferred to the newly created contract, to fulfill the precondition A:! 1 @ x. Instead, the deposit y remains in the configuration, to be possibly spent after some time. The component {A : a#N} represents the secret committed to by A, with its length N.

**Withdraw.** Executing withdraw A terminates the contract, and transfers its whole balance to A by creating a fresh deposit owned by A:

$$\langle \mathsf{withdraw}\ \mathsf{A} + C', v \rangle\_x \to \langle \mathsf{A}, v \rangle\_y \tag{y \text{ fresh}}$$

Above, withdraw A is executed as a branch within a choice: as usual, taking a branch discards the other ones (denoted as C′).

**Split.** The split primitive can be used to spawn several new concurrent contracts, dividing the balance among them. For instance:

$$\langle (\mathtt{split}\ v\_1 \to C\_1 \mid v\_2 \to C\_2), v\_1 + v\_2 \rangle\_x \to \langle C\_1, v\_1 \rangle\_y \mid \langle C\_2, v\_2 \rangle\_z \tag{y, z \text{ fresh}}$$

**Put & Reveal.** A prefix put z & reveal a if p can be fired when the previously committed secret a (satisfying the predicate p) has been revealed, and the deposit z is available in the configuration. For instance:

$$\begin{aligned} & \langle \mathtt{put}\ z\ \&\ \mathtt{reveal}\ a\ \mathtt{if}\ |a| = N .\, C,\ v \rangle\_x \mid \langle \mathsf{A}, v' \rangle\_z \mid \{ \mathsf{A} : a \# N \} \\ & \to \langle \mathtt{put}\ z\ \&\ \mathtt{reveal}\ a\ \mathtt{if}\ |a| = N .\, C,\ v \rangle\_x \mid \langle \mathsf{A}, v' \rangle\_z \mid \mathsf{A} : a \# N \\ & \to \langle C, v + v' \rangle\_y \mid \mathsf{A} : a \# N \end{aligned}$$

In the first step, A reveals her secret a. In the second step, any participant fires the prefix; doing so transfers the deposit z into the contract.

**Authorizations.** When a branch is decorated by A : ··· it can be taken only after A has provided her authorization. For instance:

$$\begin{aligned} \langle \mathsf{A} : \mathsf{withdraw} \, \mathsf{B} + \mathsf{A} : \mathsf{withdraw} \, \mathsf{C}, v \rangle\_{x} \\ \rightarrow \langle \mathsf{A} : \mathsf{withdraw} \, \mathsf{B} + \mathsf{A} : \mathsf{withdraw} \, \mathsf{C}, v \rangle\_{x} \mid \mathsf{A} [x \rhd \mathsf{A} : \mathsf{withdraw} \, \mathsf{B}] \rightarrow \langle \mathsf{B}, v \rangle\_{y} \end{aligned}$$

In the first step, A authorizes taking the branch withdraw B. After that, any participant can fire that branch.

**Time.** We always allow time t to advance by a delay δ > 0, through a transition Γ | t → Γ | t + δ. Advancing time can enable branches decorated with after t. For instance, if t<sub>0</sub> + δ ≥ t, we have the following computation:

$$\begin{aligned} & \langle (\mathtt{after}\ t : \mathtt{withdraw}\ \mathsf{B}) + C', v \rangle\_x \mid t\_0 \\ & \to \langle (\mathtt{after}\ t : \mathtt{withdraw}\ \mathsf{B}) + C', v \rangle\_x \mid t\_0 + \delta \to \langle \mathsf{B}, v \rangle\_y \mid t\_0 + \delta \end{aligned}$$

**Runs and Strategies.** A *run* R is a (possibly infinite) sequence:

$$
\varGamma\_0 \mid t\_0 \xrightarrow{\ell\_0} \varGamma\_1 \mid t\_1 \xrightarrow{\ell\_1} \cdots
$$

where ℓ<sub>i</sub> are the transition labels, Γ<sub>0</sub> contains only deposits, and t<sub>0</sub> = 0. If R is finite, we write Γ<sub>R</sub> for its last untimed configuration, and δ<sub>R</sub> for its last time. A *strategy* Σ<sub>A</sub> is a PPTIME algorithm which allows A to select which actions to perform (possibly, time delays), among those permitted by the BitML semantics. The choice among these actions is controlled by the adversary strategy Σ<sub>Adv</sub>, which acts on behalf of all the dishonest participants. Given the strategies of all participants (including Adv), there is a unique run *conforming* to all of them.

#### **2.3 Liquidity**

A desirable property of smart contracts is *liquidity*, which requires that the contract balance is always eventually transferred to some participant. In a non-liquid contract, funds can be frozen forever, unavailable to anyone, hence effectively destroyed. There are many possible flavours of liquidity, depending e.g. on which participants are assumed to be honest, and on their strategies. The simplest form of liquidity is to consider the case where everyone cooperates: i.e. a contract is liquid if there exists some strategy for each participant such that no funds are ever frozen. However, this notion does not capture the essence of smart contracts, i.e. to allow mutually untrusted participants to safely interact.

For instance, consider the following contract, where A and B contribute 1B each for a donation of 2B to either C or D (we omit the preconditions for brevity):

A : B : withdraw C + A : B : withdraw D

In order to unlock the funds, A and B must agree on the recipient of the donation, by giving their authorization on the same branch. This contract would be liquid only by assuming the cooperation between A and B: indeed, A alone cannot guarantee that the 2B will eventually be donated, as B can choose a different recipient, or even refuse to give any authorization. Consequently, unless A trusts B, it makes sense to consider this contract as non-liquid, from the point of view of A (and for similar reasons, also from that of B).

Consider now the timed commitment contract discussed before:

reveal a. withdraw A + after t : withdraw B

This contract is liquid from A's point of view (even if B is dishonest), because A can reveal the secret and then redeem the funds from the contract. The timed commitment is also liquid from B's point of view: if A does not reveal the secret (making the first branch stuck), the funds in the contract can be redeemed through the second branch, after time t.

In a *mutual* timed commitment contract, where A and B have to exchange their secrets or pay a 1B penalty, achieving liquidity is a bit more challenging. We first consider a wrong attempt:

reveal a. reveal b. split (1B → withdraw A | 1B → withdraw B) + after t : withdraw B

Intuitively, A has only the following strategies, according to when she decides to reveal her secret a: (i) A chooses to reveal a unconditionally, performing the reveal a action. This strategy is *not* liquid: indeed, if B does not reveal b, the contract is stuck. (ii) A chooses to reveal a only *after* B has revealed b. This strategy is *not* liquid: indeed, if B chooses not to reveal b, the contract will never advance. (iii) A chooses to wait until B reveals b, or until some time t′ ≥ t, whichever comes first. If b was revealed, A reveals a, and splits the contract balance between A and B. Otherwise, once the deadline t has expired, A transfers the whole balance to B. Note that, although this strategy is liquid, it is not satisfactory for A, since in the second case she loses money.

This example highlights a crucial point: participants' strategies have to be taken into account when defining liquidity. Indeed, the mere fact that a liquid strategy exists does not imply that it is the ideal strategy for the honest participant. To fix this issue, we revise the mutual timed commitment as follows:

```
  reveal a. (  reveal b. split (1B → withdraw A | 1B → withdraw B)
             + after t′ : withdraw A)
+ after t : withdraw B
```
where t < t′. Now, A has a liquid strategy where she does not pay the penalty. First, A reveals a before time t. After that, if B reveals b, then A can execute the split, transferring 1B to herself and 1B to B (note that this does not require B's cooperation); otherwise, after time t′, A can withdraw 2B by executing the withdraw A in the after t′ : ··· branch.

These examples, albeit elementary, show that detecting if a strategy is liquid for a contract is not straightforward, in general. The problem of determining a liquid strategy for a given contract seems even more demanding. Automatic techniques for the verification and inference of liquid strategies can be useful tools for the developers of smart contracts.

#### **2.4 Verifying Liquidity**

One of the main contributions of this paper is a verification technique for the liquidity of BitML contracts. Our technique is based on a more general result, i.e. a strict correspondence between the semantics of BitML in [14] (hereafter, called *concrete* semantics) and a new abstract semantics, which is finite-state (Theorem 1). Our abstraction is a correct and complete approximation of the concrete semantics with respect to a given set of contracts (Theorems 2 and 3). To obtain a finite-state abstraction, we need to cope with three sources of infiniteness of the concrete semantics of BitML: the unbounded passing of time, the advertisement/stipulation of new contracts, and the operations on deposits. Our abstraction replaces the time t in concrete configurations with a finite number of time intervals T = [t<sub>0</sub>, t<sub>1</sub>), and it disables the transitions to advertise new contracts. Further, the only operations on deposits allowed by the abstract semantics are the ones for transferring them to contracts and for destroying them. The latter is needed e.g. to properly model the situation where a participant spends a ?-deposit.

The intended use of our abstraction is to start from a configuration containing an arbitrary (but finite) set of contracts, and then analyse their possible evolutions in the presence of an honest participant and an adversary. This produces a finite set of (finite) traces, which we can model-check for liquidity. Soundness and completeness of the abstraction are exploited to prove that liquidity is decidable (Theorem 4). The computational soundness of the BitML compiler [14] guarantees that if a contract is verified to be liquid according to our analysis, this property is preserved when executing it on Bitcoin.
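The model-checking step can be illustrated on a toy abstraction of our own, far coarser than the paper's: model a contract as a finite transition system whose moves are tagged with who can fire them, and check that from every reachable state the honest participant A alone can still drive the contract to a state where the funds are released. The state names and move tags below are illustrative, not the paper's formalism:

```python
from collections import deque

# Toy finite-state models. Moves tagged "A" or "any" can be fired by
# the honest participant A alone; "B" and "time" moves belong to the
# context (adversary and clock).

# The timed commitment TC of Sect. 2: A may reveal and then anyone may
# fire withdraw A; after the deadline, anyone may fire withdraw B.
TC = {
    "init":     [("A", "revealed"), ("time", "timeout")],
    "revealed": [("any", "done")],
    "timeout":  [("A", "revealed"), ("any", "done")],
    "done":     [],
}

# The donation A:B:withdraw C + A:B:withdraw D, simplified: each
# branch needs an authorization from A *and* one from B.
DONATE = {
    "init":  [("A", "authA"), ("B", "authB")],
    "authA": [("B", "done")],
    "authB": [("A", "done")],
    "done":  [],
}

def reachable(lts, start, honest_only=False):
    """States reachable from `start`, optionally via honest moves only."""
    seen, todo = {start}, deque([start])
    while todo:
        u = todo.popleft()
        for actor, v in lts[u]:
            if honest_only and actor not in ("A", "any"):
                continue
            if v not in seen:
                seen.add(v)
                todo.append(v)
    return seen

def liquid(lts, start, terminal=frozenset({"done"})):
    """Liquid iff from every state the context can reach, A alone can
    still drive the contract to a terminal (fund-releasing) state."""
    return all(reachable(lts, s, honest_only=True) & terminal
               for s in reachable(lts, start))

assert liquid(TC, "init")          # A's reveal strategy never freezes funds
assert not liquid(DONATE, "init")  # A alone cannot complete either branch
```

This mirrors the informal analysis of Sect. 2.3: the timed commitment is liquid for A, while the two-authorization donation is not, since after A's move only B can act.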

### **3 Liquidity**

In this section we formalise a notion of liquidity of contracts, and we suggest some possible variants. Aiming at generality, liquidity is parameterised over (i) a set X of contract names, uniquely identifying the contracts under observation; (ii) a participant A (with her strategy Σ<sub>A</sub>), which we assume to be the only honest participant in the system. Roughly, we want that the funds stored within the contracts X are eventually transferred to some participant, in any run conforming to A's strategy. The actual definition is a bit more complex, because the other participants may play against A, e.g. avoiding to reveal their secrets, or to give their authorizations for some branch.

We start by introducing an auxiliary partial function *orig*<sub>R0</sub>(R, x) that, given a contract name x and an extension R of a run R<sub>0</sub>, determines the ancestor y of x in the last configuration of R<sub>0</sub>, if any. Intuitively, *orig*<sub>R0</sub>(R, x) = y means that y has evolved along R, eventually leading to x (and possibly to other contracts).

In BitML, there are only two ways to make a contract evolve into another contract. First, a split can spawn new contracts, e.g.:

$$\langle \mathsf{split}\ (v\_1 \to C\_1 \mid v\_2 \to C\_2), v\_1 + v\_2 \rangle\_x \xrightarrow{split(x)} \langle C\_1, v\_1 \rangle\_{y\_1} \mid \langle C\_2, v\_2 \rangle\_{y\_2}$$

Here, both y<sub>1</sub> and y<sub>2</sub> have x as ancestor. Second, put&reveal reduces as follows:

$$\langle \mathtt{put}\ z\ \&\ \mathtt{reveal}\ a .\, C,\ v \rangle\_x \mid \langle \mathsf{A}, v' \rangle\_z \mid \cdots \xrightarrow{put(z,a,x)} \langle C, v + v' \rangle\_y \mid \cdots$$

In this case, the ancestor of y is x.

$$\begin{aligned} \mathit{orig}\_{R\_0}(R\_0, x) &= x \quad \text{if } x \in \mathit{cn}(\Gamma\_{R\_0})\\ \mathit{orig}\_{R\_0}(R' \xrightarrow{\ell} \Gamma, x) &= \begin{cases} \mathit{orig}\_{R\_0}(R', x) & \text{if } x \in \mathit{cn}(\Gamma\_{R'})\\ \mathit{orig}\_{R\_0}(R', y) & \text{if } x \in \mathit{cn}(\Gamma) \setminus \mathit{cn}(\Gamma\_{R'}) \text{ and } (\ell = \mathit{split}(y) \text{ or } \ell = \mathit{put}(z, a, y)) \end{cases} \end{aligned}$$

**Fig. 3.** Definition of the partial function *orig*.

**Definition 1.** *Let* R *be a run extending some run* R<sub>0</sub>*, and let* x *be a contract name. We define orig*<sub>R0</sub>(R, x) *by induction on the length of* R *in Fig. 3, where* cn(Γ) *denotes the set of contract names in* Γ*.*

*Example 1.* Let R<sub>0</sub> be a run with last configuration Γ<sub>R0</sub> = ⟨C<sub>1</sub>, v⟩<sub>y</sub> | ⟨A, v⟩<sub>z</sub>, and let R be the following extension of R<sub>0</sub>, where the contracts C<sub>1</sub> and C<sub>2</sub> are immaterial, but for the fact that they enable the displayed moves:

$$\begin{aligned} \langle C\_1, v \rangle\_y \mid \langle \mathsf{A}, v \rangle\_z &\to \langle C\_1, v \rangle\_y \mid \langle \mathsf{A}, v \rangle\_z \mid \{G\}C\_2 \to^\* \langle C\_1, v \rangle\_y \mid \langle C\_2, v \rangle\_x \\ &\xrightarrow{split(x)} \langle C\_1, v \rangle\_y \mid \langle C'\_2, v \rangle\_{x'} \\ &\xrightarrow{split(y)} \langle C'\_1, v' \rangle\_{y'} \mid \langle C''\_1, v - v' \rangle\_{y''} \mid \langle C'\_2, v \rangle\_{x'} \end{aligned}$$

We have that *orig*<sub>R0</sub>(R, y′) = *orig*<sub>R0</sub>(R, y″) = y, since the corresponding contracts have been obtained through a split of the ancestor y, which was in the last configuration of R<sub>0</sub>. Instead, *orig*<sub>R0</sub>(R, x′) is undefined, because its ancestor x is not in R<sub>0</sub>. Further, *orig*<sub>R0</sub>(R, y) = y, while *orig*<sub>R0</sub>(R, x) is undefined.
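The inductive definition of *orig* amounts to a backwards walk over the labelled steps of the extension. The sketch below replays Example 1; the encoding of a run as (label, contracts-before, contracts-after) triples is our own, chosen only for illustration:

```python
def orig(r0_contracts, steps, x):
    """Ancestor of contract name x at the end of run R0, or None if
    undefined. `steps` lists the extension's transitions as
    (label, names_before, names_after); for split/put moves the last
    element of the label names the parent contract."""
    for label, before, after in reversed(steps):
        if x in before:
            continue                 # x already existed before this step
        if x in after and label[0] in ("split", "put"):
            x = label[-1]            # follow the ancestor named in the label
        else:
            return None              # created by stipulation: undefined
    return x if x in r0_contracts else None

# Replaying Example 1 (contract names only; deposits omitted):
steps = [
    (("advertise",), {"y"}, {"y"}),            # ... | {G}C2 appears
    (("init",),      {"y"}, {"y", "x"}),       # stipulation activates x
    (("split", "x"), {"y", "x"}, {"y", "x'"}),
    (("split", "y"), {"y", "x'"}, {"y'", "y''", "x'"}),
]
assert orig({"y", "z"}, steps, "y'") == "y"    # obtained by splitting y
assert orig({"y", "z"}, steps, "x'") is None   # ancestor x not in R0
```

Walking backwards, `y'` is traced to its parent `y` at the split step and `y` survives back to R<sub>0</sub>; `x'` is traced to `x`, whose creating step is a stipulation, so the function is undefined, exactly as in the example.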

We now formalise liquidity. Assume that we want to observe a single contract x, occurring in the last configuration of some run R<sub>0</sub> (note that x has been stipulated at some point during R<sub>0</sub>). A participant A wants to know if the strategy Σ<sub>A</sub> allows her to make x evolve so that funds are never frozen within the contract. We require that A can do this *without* the help of the other participants, which therefore we model as a single adversary Adv. More precisely, we say that x is liquid for A when, after any extension R of R<sub>0</sub>, Σ<sub>A</sub> can choose a sequence of moves so as to make all the descendant contracts of x terminate, transferring their funds to some participant (possibly not A). Note that such moves cannot reveal secrets of other participants, or generate authorizations for them: A must be able to unfreeze the funds on her own, using her strategy. By contrast, R can also involve such moves, but it must conform to A's strategy. The actual definition of liquidity generalises the above to sets X<sub>0</sub> of contract names.

**Definition 2 (Liquidity).** *Let* A *be an honest participant with strategy* $\Sigma\_{\mathsf{A}}$*, let* $\mathcal{R}\_0$ *be a run, and let* $X\_0$ *be a set of contract names in* $\Gamma\_{\mathcal{R}\_0}$*. We say that* $X\_0$ *is* liquid w.r.t. $\Sigma\_{\mathsf{A}}$ in $\mathcal{R}\_0$ *if, for all finite extensions* $\mathcal{R}$ *of* $\mathcal{R}\_0$ *conforming to* $\Sigma\_{\mathsf{A}}$ *and to some* $\Sigma\_{\mathsf{Adv}}$*, there exists an extension* $\dot{\mathcal{R}} = \mathcal{R} \xrightarrow{\ell\_1} \cdots \xrightarrow{\ell\_n}$ *of* $\mathcal{R}$ *such that:*

$$\forall i \in 1..n \;:\; \ell\_i \in \Sigma\_{\mathsf{A}}(\mathcal{R} \xrightarrow{\ell\_1} \cdots \xrightarrow{\ell\_{i-1}}) \tag{1}$$

$$x \in \operatorname{cn}(\Gamma\_{\dot{\mathcal{R}}}) \implies \operatorname{orig}\_{\mathcal{R}\_0}(\dot{\mathcal{R}}, x) \notin X\_0 \tag{2}$$

Condition (1) requires that all the moves after $\mathcal{R}$ can be taken by A alone, conforming to her strategy. Condition (2) checks that $\dot{\mathcal{R}}$ no longer contains descendants of the contracts $X\_0$: since in BitML active contracts always store some funds, this is actually equivalent to checking that funds are not frozen.

We remark that, although Definition 2 is instantiated on BitML, the basic concepts it relies upon (runs, strategies, termination of contracts) are quite general. Hence, our notion of liquidity, as well as the variants proposed below, can be applied to other languages for smart contracts, using their transition semantics.
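Since liquidity only refers to a transition semantics, its core condition can be phrased as a reachability check over any finite-state fragment of such a semantics. The sketch below is our own illustration (not part of the BitML toolchain): the states, the `steps` successor function, and the participant labels `"A"`/`"Adv"` are all assumed encodings. It checks that from every reachable state, A alone can drive the observed contracts to termination.

```python
from collections import deque

def is_liquid(s0, steps, terminated):
    """Core check behind Definition 2, on an explicit finite LTS.

    steps(s) yields pairs (who, s2): a move by "A" or "Adv" from s to s2.
    terminated(s) holds when no observed contract is active in s.
    Liquid: from every state reachable by arbitrary moves, A alone
    can reach a terminated state, unfreezing all funds."""
    # 1. collect all states reachable under arbitrary moves
    seen, todo = {s0}, deque([s0])
    while todo:
        s = todo.popleft()
        for _, s2 in steps(s):
            if s2 not in seen:
                seen.add(s2)
                todo.append(s2)

    # 2. from each of them, A alone must be able to terminate
    def a_can_terminate(s, visited):
        if terminated(s):
            return True
        visited.add(s)
        return any(s2 not in visited and a_can_terminate(s2, visited)
                   for who, s2 in steps(s) if who == "A")

    return all(a_can_terminate(s, set()) for s in seen)
```

For instance, a toy contract whose only exit requires an adversary move (in the style of B : withdraw A, observed from A's side) is reported non-liquid, while one that A can drain alone is liquid.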

*Example 2.* Recall the timed commitment contract *TC* from Sect. 2. Assume that A's strategy is to wait until time $t - 1$ (i.e., one time unit before the deadline), then reveal the secret and fire withdraw A. Let $\mathcal{R}\_0$ be a run with final configuration $\langle \mathit{TC}, 1\mathfrak{B}\rangle\_x \mid \{\mathsf{A} : a\#N\}$, for some length $N$. We have that $\{x\}$ is liquid w.r.t. $\Sigma\_{\mathsf{A}}$ in $\mathcal{R}\_0$, while it is *not* liquid w.r.t. the strategy where A does not reveal the secret, or reveals it without firing withdraw A. Indeed, under these strategies A alone cannot make $x$ terminate.

*Example 3.* Consider the following two contracts, which both require as precondition that A puts a deposit of 2B and commits to a secret a, and where p is an arbitrary predicate on a:

$$C\_1 = \texttt{reveal}\;a\;\texttt{if}\;p.\,\texttt{withdraw}\;\mathsf{A} \;+\; \texttt{reveal}\;a\;\texttt{if}\;\neg p.\,\texttt{withdraw}\;\mathsf{B}$$

$$C\_2 = \texttt{split}\;\big(\,1\mathfrak{B} \to \texttt{reveal}\;a\;\texttt{if}\;p.\,\texttt{withdraw}\;\mathsf{A} \;\mid\; 1\mathfrak{B} \to \texttt{reveal}\;a\;\texttt{if}\;\neg p.\,\texttt{withdraw}\;\mathsf{B}\,\big)$$

Assume that A's strategy is to reveal the secret, and then fire any enabled withdraw. Under this strategy, $C\_1$ is liquid: one of the two reveal branches is enabled, and the corresponding withdraw is fired, transferring 2B either to A or to B. Instead, no strategy of A can make $C\_2$ liquid. If A does not reveal the secret, then the 2B are frozen; otherwise, if A reveals the secret, then only one of the two descendants of $C\_2$ can fire its reveal, and so 1B remains frozen.
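The arithmetic of Example 3 can be replayed with a tiny computation. The encoding below is ours and purely illustrative: a `+`-choice is a list of guarded branches over its whole balance, and a split is a list of independent legs.

```python
def redeemable_choice(branches, p):
    # a +-choice releases its whole balance once some guard holds;
    # branches is a list of (guard, value) pairs, guard a predicate on p
    return max((v for guard, v in branches if guard(p)), default=0)

def redeemable_split(legs, p):
    # the legs of a split evolve independently: each unfreezes on its own
    return sum(redeemable_choice(branches, p) for branches in legs)

# C1: a single 2B balance, guarded by p on one branch and by not-p on the other
C1 = [(lambda p: p, 2), (lambda p: not p, 2)]
# C2: two independent 1B legs, one guarded by p, the other by not-p
C2 = [[(lambda p: p, 1)], [(lambda p: not p, 1)]]

for p in (True, False):
    assert redeemable_choice(C1, p) == 2  # all 2B can be unfrozen
    assert redeemable_split(C2, p) == 1   # 1B stays frozen either way
```

Whatever the truth value of p, the choice in C1 unfreezes the full 2B, while one leg of C2 is always stuck.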

*Example 4 (Lottery).* Consider a lottery between two players. The preconditions require A and B to commit to one secret each (a and b, respectively), and to put a deposit of 3B each (1B as a bet, and 2B as a penalty for dishonest behaviour):

$$
\begin{aligned}
\mathit{Lottery}(\mathit{Win}) = \texttt{split}\;\big(\; & 2\mathfrak{B} \to (\texttt{reveal}\;b\;\texttt{if}\;0 \le |b| \le 1.\,\texttt{withdraw}\;\mathsf{B}) + (\texttt{after}\;t : \texttt{withdraw}\;\mathsf{A}) \\
\mid\; & 2\mathfrak{B} \to (\texttt{reveal}\;a.\,\texttt{withdraw}\;\mathsf{A}) + (\texttt{after}\;t : \texttt{withdraw}\;\mathsf{B}) \\
\mid\; & 2\mathfrak{B} \to \mathit{Win}\;\big) \\[3pt]
\mathit{Win} = \; & \texttt{reveal}\;a\;b\;\texttt{if}\;|a| = |b|.\,\texttt{withdraw}\;\mathsf{A} \;+\; \texttt{reveal}\;a\;b\;\texttt{if}\;|a| \neq |b|.\,\texttt{withdraw}\;\mathsf{B}
\end{aligned}
$$

The contract splits the balance in three parts, of 2B each. The first part allows B to reveal b and then redeem 2B; otherwise, after the deadline A can redeem B's penalty (as in the timed commitment). Similarly, the second part allows A to redeem 2B by revealing a. To determine the winner we compare the secrets, in the subcontract *Win*: A wins if the secrets have the same length, otherwise B wins. This lottery is *fair*, since: (i) if both players are honest, then they will reveal their secrets within the deadlines (redeeming 2B each), and then they will have a 1/2 probability of winning<sup>2</sup>; (ii) if a player is dishonest, not revealing the secret, then the other player has a positive payoff, since she can redeem 4B.
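The fairness claim can be double-checked with a back-of-the-envelope payoff computation. The function below is our own simplification: the flags `b_short` (whether |b| ≤ 1, enabling B's reveal branch in the first part) and `b_reveals` (whether B eventually reveals b) summarise B's behaviour, and amounts are in B, relative to the initial 3B deposits.

```python
def payoffs(b_short, b_reveals, p_a_wins):
    """Return (payoff_A, payoff_B) for Lottery(Win), as sketched above."""
    a = b = -3.0                  # each player deposits 1B bet + 2B penalty
    if b_short and b_reveals:
        b += 2.0                  # B redeems his 2B in the first part
    else:
        a += 2.0                  # after t, A takes B's penalty
    a += 2.0                      # A (honest) reveals a and redeems her 2B
    if b_reveals:                 # Win compares the two secrets
        a += 2.0 * p_a_wins
        b += 2.0 * (1.0 - p_a_wins)
    # if B never reveals, the 2B in Win stay frozen (non-liquidity)
    return a, b

assert payoffs(True, True, 0.5) == (0.0, 0.0)    # both honest: zero expected payoff
assert payoffs(True, False, 0.5)[0] == 1.0       # B silent: A still gains 1B
assert payoffs(False, True, 0.0) == (1.0, -1.0)  # |b| > 1: B wins Win, loses overall
```

The last line replays footnote 2: choosing |b| > 1 lets B win *Win* for sure, yet his overall payoff is still negative.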

Although fair, *Lottery*(*Win*) is non-liquid w.r.t. *any* strategy of A. Indeed, if B does not reveal his secret, then the 2B stored in the *Win* subcontract are frozen. We can recover liquidity by replacing *Win* with the following:

$$\begin{aligned} \mathit{Win}\_2 = \mathit{Win} &+ (\texttt{after}\;t' : \texttt{reveal}\;a.\,\texttt{withdraw}\;\mathsf{A}) \\ &+ (\texttt{after}\;t' : \texttt{reveal}\;b.\,\texttt{withdraw}\;\mathsf{B}) \end{aligned}$$

where $t' > t$. In this case, even if B does not reveal $b$, A can use a strategy firing any enabled withdraw at time $t'$, to unfreeze the 2B stored in *Win*2.

We now present some variants of the notion of liquidity introduced above.

**Multiparty Liquidity.** A straightforward generalisation of liquidity is to assume a set of honest participants (rather than just one). In this case, we can extend Definition 2 by requiring that the run R conforms to the strategies of all honest participants, and the moves in (1) can be taken by any honest participant.

We illustrate this notion through the following escrow contract between two participants A and B, where the precondition requires A to deposit 1B:

```
Escrow = A : withdraw B + B : withdraw A + A : Resolve + B : Resolve
Resolve = split(0.1B → withdraw M
              | 0.9B → M : withdraw A + M : withdraw B)
```
After the contract has been stipulated, A can choose to pay B, by authorizing the first branch. Similarly, B can allow A to take her money back, by authorizing the second branch. If they do not agree, either of them can invoke a mediator M to resolve the dispute, by taking a *Resolve* branch. There, the 1B deposit is split in two parts: 0.1B go to the mediator, while 0.9B are assigned either to A or to B, depending on M's choice.

Assuming that only A is honest, this contract does not admit any liquid strategy for A, according to Definition 2. This is because B can invoke the mediator, who can refuse to act, freezing the funds within the contract. Similarly, neither B alone nor M alone has a liquid strategy. Instead, *Escrow* admits a liquid multiparty strategy for any pair of honest participants. For instance, if A and M are honest, their strategies could be the following. A chooses whether to authorize

<sup>2</sup> Note that B could increase his probability of winning the lottery by choosing a secret with length *N >* 1. However, doing so would make B lose his 2B deposit in the first part of the split, and so B's *average* payoff would be negative.

the first branch or not; in the first case, she fires withdraw B; otherwise, if B gives his authorization within a certain deadline, then A withdraws 1B; if not, after the deadline A invokes M. The strategy of M is to authorize some participant to redeem the 0.9B, and to fire all the withdraws within *Resolve*.

**Strategyless Liquidity.** Another variant of liquidity can be obtained by inspecting only the contract, neglecting A's strategy. In this case, we consider the contract as liquid when there exists some strategy of A which satisfies the constraints in Definition 2. For instance, the contract B : withdraw A is non-liquid from A's point of view, according to this notion, while it would be liquid for B.

**Quantitative Liquidity.** Definition 2 requires that no funds remain frozen within the contract. However, in some cases A could accept that a portion of the funds remain frozen, especially when these funds would ideally be assigned to other participants. Following this intuition, we could call a contract v*-liquid* w.r.t. ΣA if at least v bitcoins are guaranteed to be redeemable. If the contract uses only !-deposits, the special case where v is the sum of all these deposits corresponds to the notion in Definition 2. For instance, *Lottery*(*Win*) from Example 4 is non-liquid for any strategy of A, but it is 4B-liquid if A's strategy is to reveal her secret, and to perform all the enabled withdraws. Instead, *Lottery*(*Win*2) is 6B-liquid, and hence also liquid, under this strategy.

A refinement of this variant could require that at least vB are transferred to A, rather than to any participant. Under this notion, both *Lottery*(*Win*) and *Lottery*(*Win*2) would be 2B-liquid for A. Further, *Lottery*(*Win*2) would be 4B-liquid in case A wins the lottery.

**Liquidity with Unknown Secrets.** All the notions of liquidity proposed so far depend on the initial run R0, which contains the lengths of the committed secrets. For instance, consider the run ending with the following configuration:

```
{B : b#0} | ⟨(reveal b if |b| = 1. B : withdraw A) + withdraw A, 1B⟩x
```
Since the length of b is zero, the reveal branch cannot be taken, so A has a liquid strategy (e.g., fire the withdraw A). Instead, in an alternative initial run where B chooses a secret of length 1, A has no liquid strategy, since B can reveal the secret and then deny his authorization, freezing 1B.

In practice, when A performs the liquidity analysis, she does not know the secrets of other participants. To be safe, A should use a worst-case analysis, which would regard the contract (reveal b if |b| = 1. B : withdraw A) + withdraw A as non-liquid. We can obtain such a worst-case analysis by verifying liquidity (in the flavour of Definition 2) for all possible choices of the lengths of Adv's secrets. Although there are infinitely many such lengths, each contract only checks a finite set of if conditions. Hence, the infinite set of lengths can be partitioned into a finite set of regions, which can be used as samples for the analysis. In this way, the basic liquidity analysis is performed a finite number of times.
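The region construction can be sketched as follows, under the assumption that the contract's if conditions compare secret lengths only against constants (the function and its output format are ours): each constant yields a singleton region, the gaps between constants yield interval regions, and one sample length per region suffices for the analysis.

```python
def length_regions(thresholds):
    """Return one sample length per region of N induced by the
    constants appearing in the contract's if conditions."""
    ts = sorted(set(thresholds))
    samples, lo = [], 0
    for t in ts:
        if lo < t:
            samples.append(lo)   # interval region [lo, t)
        samples.append(t)        # singleton region {t}
        lo = t + 1
    samples.append(lo)           # unbounded region [lo, +inf)
    return samples

# the contract above only checks |b| = 1: regions {0}, {1}, [2, +inf)
assert length_regions([1]) == [0, 1, 2]
assert length_regions([1, 3]) == [0, 1, 2, 3, 4]
```

Running the basic liquidity analysis once per sample then covers all infinitely many secret lengths.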

Similar worst-case analyses can be obtained for all the other above-mentioned variants of liquidity. An average-case analysis can be obtained by assuming knowledge of the probability distribution of the adversary's secret lengths, partitioning the lengths as in the worst-case analysis.

**Other Variants.** Mixing multiparty and strategyless liquidity, we obtain the notion of liquidity used in [35], in the context of Ethereum smart contracts. This notion considers a contract liquid if there exists a collaborative strategy of all participants that never freezes funds. Other variants may take into account the time when funds become liquid, the payoff of strategies (e.g., ruling out irrational adversaries), or fairness issues. Note indeed that Definition 2 already assumes a sort of fairness, by effectively forbidding the adversary to interfere when the honest participant attempts to unfreeze some funds. Technically, this is implemented in item (1) of Definition 2, requiring that the moves $\ell\_1 \ldots \ell\_n$ are performed atomically. Atomicity might be realistic in some settings, but not in others. For instance, in Ethereum a sequence $\ell\_1 \ldots \ell\_n$ of method calls can be performed atomically: this requires deploying a new contract with a suitable method which performs the calls $\ell\_1 \ldots \ell\_n$ in sequence, and then invoking it. BitML, instead, does not allow participants to perform an atomic sequence of moves: an honest participant could start to perform the sequence, but at some point in the middle the adversary may interfere. To make the contract liquid, the honest participant must still have a way to unfreeze the funds from the contract. Of course, the adversary could interfere once again, and so on. This could lead to an infinite trace where each attempt by the honest player is hindered by the adversary. However, this is not an issue in BitML, for the following reason. Since the moves $\ell\_1 \ldots \ell\_n$ make the contract terminate, we can safely assume that each of these moves makes the contract progress (as moves which do not affect the contract can be avoided). Since a BitML contract cannot progress forever without terminating (and unfreezing its funds), the honest participant just needs to be able to make one step at a time (with possible interferences by the adversary, which may affect the choice of the next step). Defining liquidity beyond BitML and Ethereum may require ruling out unfair runs, where the adversary prevents honest participants from performing the needed sequences of moves.

### **4 A Finite-State Semantics of BitML**

The concrete BitML semantics is infinite-state because participants can always create new contracts and deposits, and can advance the current time (a natural number). In this section we introduce an abstract semantics for BitML, which abstracts both these features so as to reduce the state space to a finite one. More specifically, a concrete configuration $\Gamma \mid t$ is abstracted by separately abstracting the untimed configuration $\Gamma$ and the time $t$.


$$\alpha\_{X,Z}(\langle C,v\rangle\_x) = \begin{cases} \langle C,v\rangle\_x & \text{if } x \in X\\ 0 & \text{otherwise} \end{cases} \qquad \alpha\_{X,Z}(\{\mathsf{A}:a\#N\}) = \begin{cases} \{\mathsf{A}:a\#N\} & \text{if } a \in Z\\ 0 & \text{otherwise} \end{cases}$$

$$\alpha\_{X,Z}(\langle \mathsf{A},v\rangle\_x) = \begin{cases} \langle \mathsf{A},v\rangle\_x & \text{if } x \in Z\\ 0 & \text{otherwise} \end{cases} \qquad \alpha\_{X,Z}(\mathsf{A}:a\#N) = \begin{cases} \mathsf{A}:a\#N & \text{if } a \in Z\\ 0 & \text{otherwise} \end{cases}$$

$$\alpha\_{X,Z}(\mathsf{A}[\chi]) = \begin{cases} \mathsf{A}[\chi] & \text{if } \chi = x \rhd D \text{ and } x \in X\\ \mathsf{A}[x,0 \rhd \mathsf{y}^\star] & \text{if } \chi = x \rhd \mathsf{A} \text{ and } x \in Z\\ 0 & \text{otherwise} \end{cases}$$

$$\alpha\_{X,Z}(\{G\}C) = 0 \qquad \alpha\_{X,Z}(\Delta \mid \Delta') = \alpha\_{X,Z}(\Delta) \mid \alpha\_{X,Z}(\Delta')$$

**Fig. 4.** Abstraction of configurations.

We start by defining the abstraction of configurations.

**Definition 3 (Abstraction of configurations).** *We define the function* $\alpha\_{X,Z}$ *on concrete configurations in Fig. 4, where* $\mathsf{y}^\star$ *denotes a fixed name not present in any concrete configuration. We write* $\alpha\_X(\Gamma)$ *for* $\alpha\_{X,\mathcal{N}(X,\Gamma)}(\Gamma)$*, where:*

$$\mathcal{N}(X,\varGamma) = \{ z \, | \, \exists x, C, v, \varGamma': \varGamma = \langle C, v \rangle\_x \mid \varGamma' \,\, \land \, x \in X \,\, \land \, z \in \mathrm{dn}(C) \cup \mathrm{sn}(C) \}$$

*where* dn(C) *denotes the set of deposit names occurring in some* put *within* C*, and* sn(C) *the set of secret names occurring in some* reveal *within* C*.*

The abstraction removes from Γ all the deposits not in Z, all the (committed or revealed) secrets not in Z, and all the authorizations enabling branches of contracts not in X. All the other authorizations—except the deposit authorizations, which are handled in a special way—are removed. This is because, in the concrete semantics, deposits move into fresh ones which are no longer relevant for the contracts X. Note that if we precisely tracked such irrelevant deposits and their authorizations, our abstract semantics would become infinite-state. To cope with this issue, the abstract semantics renders deposit moves as "destroy" moves, removing the now irrelevant deposits from the configuration. As anticipated in Sect. 2.2, an authorization of a deposit move can only be performed after a "self-donate" authorization A[x ▷ A], which lets A transfer the funds in x to another of her deposits. Our abstraction maps such A[x ▷ A] into an "abstract destroy" authorization A[x, 0 ▷ y★]. In this way, in abstract configurations, deposits can be destroyed when, in concrete configurations, they are no longer relevant.
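The case analysis of Fig. 4 can be transcribed almost literally. In the sketch below the tuple encoding of configuration terms is our own, and it elides the balance bookkeeping of the actual destroy authorization; terms mapped to 0 are returned as `None` and dropped.

```python
def abstract(term, X, Z):
    """alpha_{X,Z} on a single configuration term (cf. Fig. 4)."""
    kind = term[0]
    if kind == "contract":                 # <C, v>_x : kept if x in X
        return term if term[3] in X else None
    if kind == "deposit":                  # <A, v>_x : kept if x in Z
        return term if term[3] in Z else None
    if kind in ("committed", "revealed"):  # {A : a#N} / A : a#N : a in Z
        return term if term[2] in Z else None
    if kind == "auth-branch":              # A[x > D] : kept if x in X
        return term if term[2] in X else None
    if kind == "auth-selfdonate":          # A[x > A] : becomes a destroy
        return ("auth-destroy", term[1], term[2]) if term[2] in Z else None
    return None                            # e.g. advertisements {G}C -> 0

def abstract_config(cfg, X, Z):
    """alpha_{X,Z} over a parallel composition, dropping the 0 terms."""
    return [u for u in (abstract(t, X, Z) for t in cfg) if u is not None]

# a configuration in the spirit of Example 5, with X = {x}, Z = {y, a}:
cfg = [("contract", "C", 1, "x"), ("deposit", "B", 1, "y"),
       ("deposit", "A", 2, "z"), ("committed", "A", "a", 10),
       ("auth-selfdonate", "B", "y")]
out = abstract_config(cfg, {"x"}, {"y", "a"})
assert ("deposit", "A", 2, "z") not in out   # irrelevant deposit dropped
assert ("auth-destroy", "B", "y") in out     # self-donate becomes destroy
```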

The abstraction of time α<sup>T</sup> is parameterised over a finite set of naturals T, which partitions N into a finite set of non-overlapping intervals<sup>3</sup>. Each time t is abstracted as αT(t), which is the unique interval containing t.

<sup>3</sup> A specific choice of T, which considers all the deadlines in the contracts *X* under observation, is defined later on (Definition 8).

**Definition 4 (Abstraction of time).** *Let* $\mathcal{T} \in \wp\_{\mathit{fin}}(\mathbb{N})$*. We define the function* $\alpha\_{\mathcal{T}} : \mathbb{N} \to \wp(\mathbb{N})$ *as* $\alpha\_{\mathcal{T}}(t) = [t\_0, t\_1)$ *where:*

$$t\_0 = \max\left(\{t' \in \mathcal{T} \mid t' \le t\} \cup \{0\}\right) \quad t\_1 = \min\left(\{t' \in \mathcal{T} \mid t' > t\_0\} \cup \{+\infty\}\right)$$

**Lemma 1.** *If* $\mathcal{T} \in \wp\_{\mathit{fin}}(\mathbb{N})$*, then: (i)* $\forall t \in \mathbb{N} : t \in \alpha\_{\mathcal{T}}(t)$*; (ii)* $\mathrm{ran}\,\alpha\_{\mathcal{T}}$ *is finite.*
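Definition 4 translates directly into code. In this sketch (our own rendering; `None` encodes +∞), `alpha_T(T, t)` returns the interval [t0, t1) containing t, and the last assertion illustrates Lemma 1(ii): the abstraction has a finite range.

```python
def alpha_T(T, t):
    """Definition 4: abstract time t into the interval [t0, t1) induced
    by the finite set of deadlines T; t1 = None encodes +infinity."""
    t0 = max([d for d in T if d <= t] + [0])
    later = [d for d in T if d > t0]
    return (t0, min(later) if later else None)

T = {3, 7}
assert alpha_T(T, 0) == (0, 3)
assert alpha_T(T, 5) == (3, 7)      # 5 lies in [3, 7)
assert alpha_T(T, 9) == (7, None)   # the unbounded interval [7, +inf)
assert len({alpha_T(T, t) for t in range(1000)}) == 3  # finite range
```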

**Abstract Semantics.** We now describe the abstract semantics of BitML (the detailed formalisation is deferred to Definition 7 in Appendix A). An *abstract configuration* is a term of the form $\Gamma \mid T$, where $\Gamma$ is a concrete untimed configuration, and $T \in \mathrm{ran}\,\alpha\_{\mathcal{T}}$. We then define the transition relation $\to\_\sharp$ between abstract configurations by its differences w.r.t. the concrete relation $\to$.


**Abstract Runs.** Given an arbitrary abstract configuration $\Gamma\_0 \mid T\_0$, an *abstract run* $\mathcal{R}^\sharp$ is a (possibly infinite) sequence $\Gamma\_0 \mid T\_0 \to\_\sharp \Gamma\_1 \mid T\_1 \to\_\sharp \cdots$. While concrete runs always start (at time 0) from configurations which contain only deposits, abstract runs can start from arbitrary configurations.

**Abstract Strategies.** An *abstract strategy* $\Sigma^\sharp\_{\mathsf{A}}$ is a PPTIME algorithm which allows A to select which actions to perform, among those permitted by the abstract semantics. Conformance between abstract runs and strategies is defined similarly to the concrete case [14].

**Concretisation of Strategies.** Each abstract strategy $\Sigma^\sharp\_{\mathsf{A}}$ can be transformed into a concrete strategy $\Sigma\_{\mathsf{A}} = \gamma(\Sigma^\sharp\_{\mathsf{A}})$ as follows. The transformation is parameterised over a concrete run $\mathcal{R}\_0$ and a set of contract names $X\_0 \subseteq \mathrm{cn}(\Gamma\_{\mathcal{R}\_0})$: intuitively, $\mathcal{R}\_0$ is the concrete counterpart of the initial abstract configuration $\Gamma\_0 \mid T\_0$, and $X\_0$ is the set of contracts under observation. The strategy $\Sigma\_{\mathsf{A}}$ receives as input a concrete run $\mathcal{R}$, and it must output the next actions. If $\mathcal{R}$ is a prefix of $\mathcal{R}\_0$, the next move is chosen as in $\mathcal{R}\_0$. The case where $\mathcal{R}$ is not an extension of $\mathcal{R}\_0$ is immaterial. Assuming that $\mathcal{R}$ extends $\mathcal{R}\_0$, we first abstract the part of $\mathcal{R}$ exceeding $\mathcal{R}\_0$, so as to obtain an abstract run $\mathcal{R}^\sharp$. This is done by abstracting every configuration in the run: times are abstracted with $\alpha\_{\mathcal{T}\_0}$, while untimed configurations are abstracted with $\alpha\_X$, where $X$ is the set of the descendants of $X\_0$ in the configuration at hand. The moves of $\mathcal{R}$ are mapped to abstract moves in a natural way: moves not affecting the descendants of $X\_0$, nor their relevant deposits or secrets, are not represented in the abstract run. Once the abstract run $\mathcal{R}^\sharp$ has been constructed, we apply $\Sigma^\sharp\_{\mathsf{A}}(\mathcal{R}^\sharp)$ to obtain the next abstract actions. $\Sigma\_{\mathsf{A}}(\mathcal{R})$ is defined as the concretisation of these actions. The concretisation of the adversary strategy $\Sigma^\sharp\_{\mathsf{Adv}}$ can be defined in a similar way.

**Theorem 1.** *Starting from any abstract configuration, the relation* → *is finitely branching, and it admits a finite number of runs.*

A direct consequence of Theorem 1 is that the abstract semantics is *finite-state*, and that *each abstract run is finite*. This makes the abstract LTS amenable to model checking.

**Correspondence Between the Semantics.** We now establish a correspondence between the abstract and the concrete semantics of BitML. Assume that we have a concrete run $\mathcal{R}\_0$, representing the computation done so far. We want to observe the behaviour of a set of contracts $X\_0$ in $\Gamma\_{\mathcal{R}\_0}$ (the last untimed configuration of $\mathcal{R}\_0$). To this purpose, we run the abstract semantics, starting from an initial configuration $\Gamma^\sharp\_0$, whose untimed component is $\alpha\_{X\_0}(\Gamma\_{\mathcal{R}\_0})$. The time component is obtained by abstracting the last time $\delta\_{\mathcal{R}\_0}$ in the concrete run. The parameter $\mathcal{T}\_0$ used to abstract time is any finite superset of the deadlines occurring in the contracts $X\_0$ within $\Gamma\_{\mathcal{R}\_0}$. Hereafter we denote this set of deadlines as $\mathit{ticks}\_{X\_0}(\Gamma\_{\mathcal{R}\_0})$ (see Definition 8 in Appendix A).

When the contracts in $X\_0$ evolve, the run $\mathcal{R}\_0$ is extended to a run $\mathcal{R}$, which contains the descendants of $X\_0$, i.e. those contracts whose *origin* belongs to $X\_0$. These descendants are denoted by $\mathit{desc}\_{\mathcal{R}\_0}(\mathcal{R}, X\_0)$.

**Definition 5.** *For all concrete runs* $\mathcal{R}\_0, \mathcal{R}$ *such that* $\mathcal{R}$ *extends* $\mathcal{R}\_0$*, and for all sets of contract names* $X\_0$*, we define the set of contract names* $\mathit{desc}\_{\mathcal{R}\_0}(\mathcal{R}, X\_0)$ *as follows:*

$$\mathit{desc}\_{\mathcal{R}\_0}(\mathcal{R}, X\_0) \;=\; \big\{\, x \;\big|\; \exists \Gamma', C, v :\ \Gamma\_{\mathcal{R}} = \langle C, v \rangle\_x \mid \Gamma' \ \text{ and }\ \mathit{orig}\_{\mathcal{R}\_0}(\mathcal{R}, x) \in X\_0 \,\big\}$$

The following theorem states that the abstract semantics is a sound approximation of the concrete one. Every abstract run (conforming to A's abstract strategy $\Sigma^\sharp\_{\mathsf{A}}$) has a corresponding concrete run (conforming to the concrete strategy derived from $\Sigma^\sharp\_{\mathsf{A}}$). More precisely, each configuration $\Gamma^\sharp \mid T$ in the abstract run has a corresponding configuration in the concrete run, containing the concretisation $\Gamma$ of $\Gamma^\sharp$, besides a term $\Delta$ containing the parts unrelated to $X\_0$. Further, each move in the abstract run corresponds to an analogous move in the concrete run.

**Theorem 2 (Soundness).** *Let* $\mathcal{R}\_0$ *be a concrete run, let* $X\_0 \subseteq \mathrm{cn}(\Gamma\_{\mathcal{R}\_0})$*, let* $Z\_0 \supseteq \mathcal{N}(X\_0, \Gamma\_{\mathcal{R}\_0})$*, let* $\mathcal{T}\_0 \in \wp\_{\mathit{fin}}(\mathbb{N})$*, and let* $\Gamma^\sharp\_0 = \alpha\_{X\_0,Z\_0}(\Gamma\_{\mathcal{R}\_0}) \mid \alpha\_{\mathcal{T}\_0}(\delta\_{\mathcal{R}\_0})$*. Let* $\Sigma^\sharp\_{\mathsf{A}}$ *and* $\Sigma^\sharp\_{\mathsf{Adv}}$ *be the abstract strategies of* A *and of* Adv*, and let* $\Sigma\_{\mathsf{A}} = \gamma(\Sigma^\sharp\_{\mathsf{A}})$ *and* $\Sigma\_{\mathsf{Adv}} = \gamma(\Sigma^\sharp\_{\mathsf{Adv}})$ *be the corresponding concrete strategies. For each abstract run* $\Gamma^\sharp\_0 \to\_\sharp^\* \Gamma^\sharp \mid T$ *conforming to* $\Sigma^\sharp\_{\mathsf{A}}$ *and* $\Sigma^\sharp\_{\mathsf{Adv}}$*, there exists a concrete run:*

$$\mathcal{R} \;=\; \mathcal{R}\_0 \ \to^\* \ \Gamma \mid \Delta \mid \min T$$

*such that: (i)* $\mathcal{R}$ *conforms to* $\Sigma\_{\mathsf{A}}$ *and* $\Sigma\_{\mathsf{Adv}}$*; (ii)* $\Delta$ *contains all the subterms of* $\Gamma\_{\mathcal{R}\_0}$ *which are mapped to* $0$ *when evaluating* $\alpha\_{X\_0,Z\_0}(\Gamma\_{\mathcal{R}\_0})$*; (iii)* $\alpha\_{X,Z\_0}(\Gamma \mid \Delta) = \Gamma^\sharp$*, where* $X = \mathit{desc}\_{\mathcal{R}\_0}(\mathcal{R}, X\_0)$*; (iv)* $\alpha\_{\mathcal{T}\_0}(\min T) = T$*; (v) the labels in* $\mathcal{R}^\sharp$ *are the same as in* $\mathcal{R}$*, except for the occurrences of* $\mathsf{y}^\star$*.*

Note that soundness only guarantees the existence of some concrete runs, which are a strict subset of all the possible concrete runs. For instance, the concrete semantics also allows the non-observed part $\Delta$ to progress, and it contains configurations with a time $t \neq \min T$, for any $T$ in any abstract run. Still, these concrete runs have an abstract counterpart, as established by the following completeness result (Theorem 3), which is almost dual to our soundness result (Theorem 2). Completeness maps concrete configurations to abstract ones using our abstraction functions for untimed configurations and time. Moreover, this run correspondence holds when the concrete strategy of A is derived from an abstract strategy, while no such restriction is required for the adversary strategy.

**Theorem 3 (Completeness).** *Let* $\mathcal{R}\_0$ *be a concrete run, let* $X\_0 \subseteq \mathrm{cn}(\Gamma\_{\mathcal{R}\_0})$*, let* $Z\_0 \supseteq \mathcal{N}(X\_0, \Gamma\_{\mathcal{R}\_0})$*, let* $\mathcal{T}\_0 \supseteq \mathit{ticks}\_{X\_0}(\Gamma\_{\mathcal{R}\_0})$*, and let* $\Gamma^\sharp\_0 = \alpha\_{X\_0,Z\_0}(\Gamma\_{\mathcal{R}\_0}) \mid \alpha\_{\mathcal{T}\_0}(\delta\_{\mathcal{R}\_0})$*. Let* $\Sigma^\sharp\_{\mathsf{A}}$ *be the abstract strategy of* A*, and let* $\Sigma\_{\mathsf{A}} = \gamma(\Sigma^\sharp\_{\mathsf{A}})$ *be the corresponding concrete strategy. For each concrete run* $\mathcal{R} = \mathcal{R}\_0 \to^\* \Gamma \mid t$ *conforming to* $\Sigma\_{\mathsf{A}}$ *and to some* $\Sigma\_{\mathsf{Adv}}$*, there exists an abstract run:*

$$\mathcal{R}^\sharp = \Gamma\_0^\sharp \ \to\_\sharp^\* \ \alpha\_{X, Z\_0}(\Gamma) \mid \alpha\_{\mathcal{T}\_0}(t)$$

*such that: (i)* $\mathcal{R}^\sharp$ *conforms to* $\Sigma^\sharp\_{\mathsf{A}}$ *and to some* $\Sigma^\sharp\_{\mathsf{Adv}}$*; (ii)* $X = \mathit{desc}\_{\mathcal{R}\_0}(\mathcal{R}, X\_0)$*; (iii) if* $\mathcal{R} = \mathcal{R}\_0 \to^\* \Gamma' \mid t' \xrightarrow{\ell} \cdots$ *and* $\ell \in \Sigma\_{\mathsf{A}}(\mathcal{R}\_0 \to^\* \Gamma' \mid t')$*, then there exists* $\ell^\sharp$ *such that* $\mathcal{R}^\sharp = \Gamma^\sharp\_0 \to\_\sharp^\* \Gamma'^\sharp = \alpha\_{X',Z\_0}(\Gamma') \mid \alpha\_{\mathcal{T}\_0}(t') \xrightarrow{\ell^\sharp} \cdots$ *where* $\ell^\sharp \in \Sigma^\sharp\_{\mathsf{A}}(\Gamma^\sharp\_0 \to\_\sharp^\* \Gamma'^\sharp)$ *and* $X' = \mathit{desc}\_{\mathcal{R}\_0}(\mathcal{R}\_0 \to^\* \Gamma' \mid t', X\_0)$*.*

*Example 5.* Let C = reveal a.withdraw A + put y.withdraw B, and let R be the following concrete run, where the prefix ··· is immaterial (for simplicity, we also omit labels, times, and participants' strategies):

$$
\begin{aligned}
\cdots \to {} & \langle C, 1\mathfrak{B}\rangle\_x \mid \langle \mathsf{B}, 1\mathfrak{B}\rangle\_y \mid \langle \mathsf{A}, 2\mathfrak{B}\rangle\_z \mid \{\mathsf{A} : a\#10\} = \Gamma\_0 \\
\to {} & \langle C, 1\mathfrak{B}\rangle\_x \mid \langle \mathsf{B}, 1\mathfrak{B}\rangle\_y \mid \langle \mathsf{A}, 2\mathfrak{B}\rangle\_z \mid \{\mathsf{A} : a\#10\} \mid \mathsf{B}[y \rhd \mathsf{B}] \\
\to {} & \langle C, 1\mathfrak{B}\rangle\_x \mid \langle \mathsf{B}, 1\mathfrak{B}\rangle\_y \mid \langle \mathsf{A}, 2\mathfrak{B}\rangle\_z \mid \{\mathsf{A} : a\#10\} \mid \mathsf{B}[y \rhd \mathsf{B}] \mid \mathsf{B}[y \rhd \mathsf{C}] \\
\to {} & \langle C, 1\mathfrak{B}\rangle\_x \mid \langle \mathsf{B}, 1\mathfrak{B}\rangle\_y \mid \langle \mathsf{A}, 2\mathfrak{B}\rangle\_z \mid \mathsf{A} : a\#10 \mid \mathsf{B}[y \rhd \mathsf{B}] \mid \mathsf{B}[y \rhd \mathsf{C}] \\
\to {} & \langle \texttt{withdraw}\;\mathsf{A}, 1\mathfrak{B}\rangle\_{x'} \mid \langle \mathsf{B}, 1\mathfrak{B}\rangle\_y \mid \langle \mathsf{A}, 2\mathfrak{B}\rangle\_z \mid \mathsf{A} : a\#10 \mid \mathsf{B}[y \rhd \mathsf{B}] \mid \mathsf{B}[y \rhd \mathsf{C}] = \Gamma \\
\to {} & \langle \texttt{withdraw}\;\mathsf{A}, 1\mathfrak{B}\rangle\_{x'} \mid \langle \mathsf{C}, 1\mathfrak{B}\rangle\_{y'} \mid \langle \mathsf{A}, 2\mathfrak{B}\rangle\_z \mid \mathsf{A} : a\#10 \mid \mathsf{B}[y \rhd \mathsf{B}] \mid \mathsf{B}[y \rhd \mathsf{C}] \\
\to {} & \langle \mathsf{A}, 1\mathfrak{B}\rangle\_{x''} \mid \langle \mathsf{C}, 1\mathfrak{B}\rangle\_{y'} \mid \langle \mathsf{A}, 2\mathfrak{B}\rangle\_z \mid \mathsf{A} : a\#10 \mid \mathsf{B}[y \rhd \mathsf{B}] \mid \mathsf{B}[y \rhd \mathsf{C}]
\end{aligned}
$$

By Theorem 3, this concrete run has the following corresponding abstract run w.r.t. <sup>X</sup><sup>0</sup> <sup>=</sup> {x}. The initial configuration <sup>Γ</sup><sup>0</sup> is abstracted w.r.t. <sup>X</sup><sup>0</sup> and <sup>Z</sup><sup>0</sup> <sup>=</sup> <sup>N</sup>(X0, Γ0) = {a,y}. This causes deposit <sup>z</sup> to be neglected in the abstraction.

$$\begin{aligned} & \langle C, 1\mathfrak{B}\rangle\_x \mid \langle \mathsf{B}, 1\mathfrak{B}\rangle\_y \mid \{\mathsf{A} : a\#10\} = \Gamma\_0^\sharp\\ \to\_\sharp\ & \langle C, 1\mathfrak{B}\rangle\_x \mid \langle \mathsf{B}, 1\mathfrak{B}\rangle\_y \mid \{\mathsf{A} : a\#10\} \mid \mathsf{B}[y, 0 \rhd \mathsf{y}^\star] \\ \to\_\sharp\ & \langle C, 1\mathfrak{B}\rangle\_x \mid \langle \mathsf{B}, 1\mathfrak{B}\rangle\_y \mid \mathsf{A} : a\#10 \mid \mathsf{B}[y, 0 \rhd \mathsf{y}^\star] \\ \to\_\sharp\ & \langle \texttt{withdraw}\;\mathsf{A}, 1\mathfrak{B}\rangle\_{x'} \mid \langle \mathsf{B}, 1\mathfrak{B}\rangle\_y \mid \mathsf{A} : a\#10 \mid \mathsf{B}[y, 0 \rhd \mathsf{y}^\star] = \Gamma^\sharp\\ \to\_\sharp\ & \langle \texttt{withdraw}\;\mathsf{A}, 1\mathfrak{B}\rangle\_{x'} \mid \mathsf{A} : a\#10 \\ \to\_\sharp\ & \mathsf{A} : a\#10 \end{aligned}$$

We now compare the two runs. The concrete authorization for a self-donate of $y$ is abstracted as an authorization for destroying $y$. Instead, the concrete authorization for donating $y$ to C has no abstract counterpart. The concrete reveal of secret $a$ and the subsequent contract move have identical abstract moves, which reach the abstract configuration $\Gamma^\sharp$. Technically, $\Gamma^\sharp$ is the result of abstracting the concrete configuration $\Gamma$ w.r.t. $X' = \{x'\}$ and $Z\_0$: here, we no longer abstract w.r.t. $X\_0$, but instead use the set of its descendants $X'$. By contrast, the set $Z\_0$ is unchanged. Note that, if we instead abstracted with respect to $X\_0$, we would discard the contract $x'$, in which case we could not perform the abstract step, because the abstract semantics does not discard $x'$. Similarly, if we instead used $Z' = \mathcal{N}(X', \Gamma) = \emptyset$, we would discard the secret $a$ and the deposit $y$, invalidating the abstract steps. When $\Gamma$ performs the next move (a donation), this is abstracted as a destroy move. Finally, the last concrete withdraw move is mapped to an abstract withdraw move, which does not create the deposit $x''$.

### **5 Verifying Liquidity**

In this section we devise a verification technique for liquidity of BitML contracts, exploiting our abstract semantics. The first step is to give an abstract counterpart of liquidity: this is done in Definition 6, which mimics Definition 2, replacing concrete objects with abstract ones.

**Definition 6 (Abstract liquidity).** *Let* A *be an honest participant, with abstract strategy* $\Sigma^\sharp\_{\mathsf{A}}$*, let* $\mathcal{R}^\sharp\_0$ *be an abstract run, and let* $X\_0$ *be a set of contract names in* $\Gamma\_{\mathcal{R}^\sharp\_0}$*. We say that* $X\_0$ *is* $\sharp$-liquid w.r.t. $\Sigma^\sharp\_{\mathsf{A}}$ in $\mathcal{R}^\sharp\_0$ *if, for all extensions* $\mathcal{R}^\sharp$ *of* $\mathcal{R}^\sharp\_0$ *conforming to* $\Sigma^\sharp\_{\mathsf{A}}$ *and to some* $\Sigma^\sharp\_{\mathsf{Adv}}$*, there exists an extension* $\dot{\mathcal{R}}^\sharp = \mathcal{R}^\sharp \xrightarrow{\ell\_1} \cdots \xrightarrow{\ell\_n}$ *of* $\mathcal{R}^\sharp$ *such that:*

$$\forall i \in 1..n:\ \ell\_i \in \Sigma^{\sharp}\_{\mathsf{A}}(\mathcal{R}^{\sharp} \xrightarrow{\ell\_1} \cdots \xrightarrow{\ell\_{i-1}}) \tag{3}$$

$$x \in \operatorname{cn}(\Gamma\_{\dot{\mathcal{R}}^{\sharp}}) \implies \operatorname{orig}\_{\mathcal{R}\_0^{\sharp}}(\dot{\mathcal{R}}^{\sharp}, x) \notin X\_0 \tag{4}$$

To verify liquidity of a set of contracts $X\_0$ in a concrete run $\mathcal{R}\_0$, we will choose $\mathcal{R}^\sharp\_0$ to be the run containing the single configuration $\Gamma^\sharp\_0$, obtained by abstracting with $\alpha\_{X\_0}$ the last configuration of $\mathcal{R}\_0$. In such a case, condition (4) above can be simplified by just requiring that $\mathrm{cn}(\Gamma\_{\dot{\mathcal{R}}^\sharp}) = \emptyset$.

The following lemma states that abstract and concrete liquidity are equivalent. For this, it suffices that the abstraction is performed with respect to the contract names X0, and to the set of deadlines occurring in the contracts X0.

**Lemma 2 (Abstract vs. concrete liquidity).** *Let* R0 *be a concrete run, let* X0 ⊆ cn(ΓR0)*, and let* T0 = *ticks*X0(ΓR0)*. Let* Γ♯0 = αX0(ΓR0) | αT0(δR0)*. Let* Σ♯A *be an abstract strategy (w.r.t.* T0 *and* Γ♯0*), and let* ΣA = γR0(Σ♯A)*. Let* R♯0 = Γ♯0 *(i.e., the run with no moves). Then:*

> X0 is liquid w.r.t. ΣA in R0 ⟺ X0 is *♯-*liquid w.r.t. Σ♯A in R♯0.

The following lemma states that if a contract is liquid w.r.t. some concrete strategy, then it is also liquid w.r.t. some abstract strategy, and *vice versa*. Intuitively, this holds because if it is possible to make a contract evolve with a sequence of moves conforming to a concrete strategy, then the same moves can also be generated by an abstract strategy.

**Lemma 3.** *Let* R0 *be a concrete run, and let* X0 ⊆ cn(ΓR0)*. Then* X0 *is liquid w.r.t. some* ΣA *in* R0 *iff* X0 *is liquid w.r.t.* γ(Σ♯A) *in* R0*, for some* Σ♯A*.*

Our main technical result follows. It states that liquidity is decidable, and that it is possible to automatically infer liquid strategies for a given contract.

**Theorem 4 (Decidability of liquidity).** *Liquidity is decidable. Furthermore, for any* R0 *and* X0*, it is decidable whether there exists a strategy* ΣA *such that* X0 *is liquid w.r.t.* ΣA *in* R0*. If such a strategy exists, it can be automatically inferred from* R0 *and* X0*.*

*Proof.* Let A be an honest participant with strategy ΣA, let R0 be a concrete run, and let X0 be a set of contract names in ΓR0. By Lemma 3, X0 is liquid w.r.t. ΣA iff there exists some abstract strategy Σ♯A such that X0 is liquid w.r.t. Σ′A = γ(Σ♯A). By Lemma 2, X0 is liquid w.r.t. Σ′A iff X0 is ♯-liquid w.r.t. Σ♯A. By Theorem 1, the abstract semantics is finite, and so the possible abstract strategies are finitely many. Therefore, ♯-liquidity is decidable, and consequently liquidity is decidable as well. Note that this procedure also finds a liquid strategy, if one exists.
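Since the abstract semantics is finite-state, the decision procedure underlying this proof amounts to an exhaustive search over the abstract transition system. The following Python sketch illustrates that search on a deliberately simplified, hypothetical encoding (a state is the frozenset of live contract names, `transitions` is the abstract move relation, and `honest_labels` are the moves A can fire alone); it is an illustration of the fixpoint search, not an implementation of the BitML semantics.

```python
def is_liquid(transitions, honest_labels, start):
    """Liquidity-style check on a finite LTS (hypothetical encoding).

    transitions: dict mapping state -> {label: successor state},
    where a state is a frozenset of live contract names.
    honest_labels: labels that the honest participant A can fire alone.
    Returns True iff from every adversarially reachable state, A alone
    can drive the system to a state with no live contracts.
    """
    # 1. Collect the states reachable under arbitrary scheduling.
    reachable, frontier = {start}, [start]
    while frontier:
        s = frontier.pop()
        for t in transitions.get(s, {}).values():
            if t not in reachable:
                reachable.add(t)
                frontier.append(t)

    # 2. Least fixpoint of "A alone can reach the empty configuration".
    drain = {frozenset()}
    changed = True
    while changed:
        changed = False
        for s, moves in transitions.items():
            if s not in drain and any(t in drain
                                      for lbl, t in moves.items()
                                      if lbl in honest_labels):
                drain.add(s)
                changed = True

    # Liquid iff no reachable state can leave funds frozen forever.
    return all(s in drain for s in reachable)
```

On a toy system where the adversary may move the contract from x to y, liquidity holds exactly when the honest participant has a withdraw available in every reachable state.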

### **6 Conclusions**

We have developed a theory of liquidity for smart contracts, and a verification technique which is sound and complete for contracts expressed in BitML. Besides liquidity, our finite-state abstraction can be applied to verify other properties of smart contracts. For instance, we could decide whether a strategy allows a participant to always terminate a contract within a certain deadline. Additionally, we could infer a strategy which guarantees that the contract terminates before a certain time (if any such strategy exists), or infer the strategy that terminates in the shortest time, etc. Although our theory is focussed on BitML, the various notions of liquidity we have proposed could be applied to more expressive languages for smart contracts, such as Solidity (the high-level language used by Ethereum). To the best of our knowledge, the only form of liquidity verified so far in Ethereum is the "strategyless multiparty" variant, which only requires the existence of a cooperative strategy to unfreeze funds (this property is analysed, e.g., by the Securify tool [35]). Since Ethereum contracts are Turing-powerful, their liquidity cannot be verified in a sound and complete manner; the reduced expressiveness of BitML, instead, makes liquidity decidable in that setting.

**Acknowledgements.** Massimo Bartoletti is partially supported by Aut. Reg. of Sardinia projects *Sardcoin* and *Smart collaborative engineering*. Roberto Zunino is partially supported by MIUR PON *Distributed Ledgers for Secure Open Communities*.

### **A Appendix**

**Lemma 4.** *Let* R0, R1, R<sup>2</sup> *be such that* R<sup>1</sup> *extends* R<sup>0</sup> *and* R<sup>2</sup> *extends* R1*. Then:*

$$orig\_{\mathcal{R}\_0}(\mathcal{R}\_1, \operatorname{orig}\_{\mathcal{R}\_1}(\mathcal{R}\_2, x)) = \operatorname{orig}\_{\mathcal{R}\_0}(\mathcal{R}\_2, x).$$

**Proof of Lemma** 4 **(sketch).** By induction on R1.

**Definition 7 (Abstract semantics).** *Let* T ∈ ℘*fin*(ℕ)*. An* abstract configuration *is a term of the form* Γ | T*, where* Γ *is a concrete untimed configuration and* T ∈ ran αT*. We then define the relation* →♯ *between abstract configurations by its differences w.r.t. the concrete relation* −→*:*


$$\langle \mathsf{A}, v\rangle\_x \mid \Gamma \xrightarrow{\mathsf{A}:x,0,y}\_{\sharp} \langle \mathsf{A}, v\rangle\_x \mid \mathsf{A}[x, 0 \rhd y] \mid \Gamma \quad \text{[Dep-AbsAuthDestroy]}$$

$$\langle \mathsf{A}, v\rangle\_x \mid \mathsf{A}[x, 0 \rhd y] \mid \Gamma \xrightarrow{\mathit{destroy}(x,y)}\_{\sharp} \Gamma \quad \text{[Dep-AbsDestroy]}$$

*3. the rule* [Delay] *is replaced by the following:*

$$\frac{\delta = \min T' - \min T > 0}{\Gamma \mid T \xrightarrow{\delta}\_{\sharp} \Gamma \mid T'}$$

*4. the rule* [C-Withdraw] *is replaced by the following:*

$$\langle \texttt{withdraw}\ \mathsf{A}, v\rangle\_y \mid \Gamma \xrightarrow{\mathit{withdraw}(\mathsf{A},v,y)}\_{\sharp} \Gamma \quad \text{[C-AbsWithdraw]}$$

*5. the rule* [Timeout] *is replaced by the following:*

$$D \equiv \texttt{after}\ t\_1 : \cdots : \texttt{after}\ t\_m : D' \qquad D' \not\equiv \texttt{after}\ t' : \cdots$$

$$\frac{\langle D', v\rangle\_x \mid \Gamma \xrightarrow{\ell}\_{\sharp} \Gamma' \qquad x \in \mathit{cv}(\ell) \qquad \min T \geq t\_1, \ldots, t\_m}{\langle D + C, v\rangle\_x \mid \Gamma \mid T \xrightarrow{\ell}\_{\sharp} \Gamma' \mid T} \quad \text{[AbsTimeout]}$$

**Definition 8.** *We define the function* ticks *from contracts to* ℘*fin*(ℕ) *as follows:*

$$\begin{array}{ll}
\mathit{ticks}\,(\textstyle\sum\_{i\in I} D\_i) = \bigcup\_{i\in I}\mathit{ticks}\,(D\_i) & \mathit{ticks}\,(\mathsf{A} : D) = \mathit{ticks}\,(D)\\\\
\mathit{ticks}\,(\texttt{withdraw}\ \mathsf{A}) = \emptyset & \mathit{ticks}\,(\texttt{after}\ t : D) = \\{t\\} \cup \mathit{ticks}\,(D)\\\\
\mathit{ticks}\,(\texttt{split}\ \vec{v} \to \vec{C}) = \mathit{ticks}\,(\vec{C}) & \mathit{ticks}\,(\texttt{put}\ \vec{x}\ \\&\ \texttt{reveal}\ \vec{a}\ \texttt{if}\ p.\,C) = \mathit{ticks}\,(C)
\end{array}$$

*Then, for any set of names* X*, we define the function* ticksX *from concrete untimed configurations to* ℘*fin*(ℕ) *as follows:*

$$\begin{array}{l}
\mathit{ticks}\_X(\\{G\\}C) = \emptyset\\\\
\mathit{ticks}\_X(\langle C, v\rangle\_x) = \mathit{ticks}\,(C)\ \text{if}\ x \in X,\ \emptyset\ \text{otherwise}\\\\
\mathit{ticks}\_X(\langle \mathsf{A}, v\rangle) = \mathit{ticks}\_X(\mathsf{A}[\chi]) = \mathit{ticks}\_X(\\{\mathsf{A} : a\\#N\\}) = \mathit{ticks}\_X(\mathsf{A} : a\\#N) = \emptyset\\\\
\mathit{ticks}\_X(\Gamma \mid \Gamma') = \mathit{ticks}\_X(\Gamma) \cup \mathit{ticks}\_X(\Gamma')
\end{array}$$
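Definition 8 is a plain structural recursion, which the following Python sketch renders directly. The tuple-based AST ("withdraw", "auth", "after", "split", "put_reveal") is a hypothetical encoding chosen only for this sketch; a contract C is a list of guarded contracts D.

```python
def ticks(d):
    """Deadlines occurring in a guarded contract D (cf. Definition 8)."""
    tag = d[0]
    if tag == "withdraw":        # ticks(withdraw A) = {}
        return set()
    if tag == "auth":            # ticks(A : D) = ticks(D)
        return ticks(d[2])
    if tag == "after":           # ticks(after t : D) = {t} U ticks(D)
        return {d[1]} | ticks(d[2])
    if tag == "split":           # ticks(split v -> C) = ticks(C)
        return ticks_contract(d[1])
    if tag == "put_reveal":      # ticks(put x & reveal a if p. C) = ticks(C)
        return ticks_contract(d[1])
    raise ValueError(f"unknown guarded contract: {tag}")

def ticks_contract(c):
    """ticks(D1 + ... + Dn) = ticks(D1) U ... U ticks(Dn)."""
    return set().union(*map(ticks, c)) if c else set()
```

For example, on `after 5 : B : after 7 : withdraw A` the function collects both deadlines, {5, 7}, exactly the set T0 used by Lemma 2 to set up the time abstraction.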

**Lemma 5.** *If* R = R0 −→∗ Γ | t*, then ticks*X0(ΓR0) ⊇ *ticks*descR0(R,X0)(Γ)*.*

**Proof of Lemma** 5 **(sketch).** When a move is performed, a contract becomes syntactically smaller, hence the set of deposit names and secret names within the contract becomes a subset.

**Definition 9 (Abstract strategies).** *For any* T ∈ ℘*fin*(ℕ) *and initial abstract configuration* Γ0 | T0 *with* T0 ∈ ran αT*, we define an* abstract strategy Σ♯A *as a PPTIME algorithm which takes as input an abstract run starting from* Γ0 | T0 *and a randomness source, and outputs a finite sequence of actions. Abstract strategies are subject to the same constraints imposed on concrete ones.*

Note that, since Σ♯A can only output moves according to the abstract semantics, it can only choose delays δ which jump from an interval T to the subsequent interval T′, i.e. δ = min T′ − min T.
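To make this concrete, the sketch below assumes a half-open-interval reading of the time abstraction αT (an assumption of ours for illustration): a finite set of deadlines splits the timeline into intervals, αT maps a concrete time to the interval containing it, and the only delays available to an abstract strategy are differences of interval minima.

```python
def intervals(deadlines):
    """Split the timeline at each deadline: {5, 9} -> [0,5), [5,9), [9,inf)."""
    ds = sorted(deadlines)
    return list(zip([0] + ds, ds + [None]))   # None stands for +infinity

def alpha(deadlines, t):
    """Abstract a concrete time t into the interval containing it."""
    for lo, hi in intervals(deadlines):
        if lo <= t and (hi is None or t < hi):
            return (lo, hi)

def abstract_delay(cur, nxt):
    """The only delay allowed between two intervals: delta = min T' - min T."""
    return nxt[0] - cur[0]
```

With deadlines {5, 9}, the abstract strategy can delay by 5 (from [0,5) to [5,9)) or by 4 (from [5,9) to [9,∞)), but by no other amount; this mirrors how the concrete run in the proof of Theorem 2 moves time from min T to min T′.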

**Proof of Theorem** 1 **(sketch).** The theorem follows immediately from the definition of our abstract semantics, which, compared with the concrete semantics, removes or abstracts all the BitML rules that could violate the statement. More precisely, by rule induction we observe that each abstract step makes the configuration syntactically "smaller", ensuring termination. Further, there are finitely many rules, and each rule can only generate finitely many branches.

**Proof of Theorem** 2 **(sketch).** Essentially, the concrete run can perform the same moves as the abstract run, with the following minor changes. The abstract rules for destroying deposits (and the related authorizations) involve names y, which are replaced by fresh names in the concrete run. Further, abstract delay moves change the abstract time T to T′: in the concrete run, instead, we make time advance from min T to min T′. This makes the concrete and abstract timeout rules agree on which branches after t : D are enabled.

**Proof of Theorem** 3 **(sketch).** Each concrete move corresponds to zero or more abstract moves; in the latter case, the concrete and abstract moves are related as follows: (i) contract moves are unchanged; (ii) all authorizations are unchanged, except for A : x, B (generated by [Dep-AuthDonate]), which is abstracted as A : x, 0, y; (iii) deposit moves affecting a set Y of deposits are transformed into a sequence of [Dep-AbsDestroy] moves, destroying those deposits in Y which are present in the abstract configuration; (iv) reveal moves are unchanged; (v) delay moves are mapped to delay moves (not necessarily of the same duration).

**Proof of Lemma** 2. See [15].

**Proof of Lemma** 3 **(sketch).** The lemma holds since Σ♯A can be defined in terms of ΣA, in such a way as to preserve the following invariant: each run conforming to Σ♯A can be transformed into a concrete run conforming to ΣA. Upon receiving a (conforming) abstract run, if some descendant of X0 is still present, Σ♯A computes a corresponding concrete run and queries ΣA with it, learning the next concrete moves. Since X0 is liquid, the concrete strategy must eventually perform a move which is relevant for the contracts X0, and that move can then be chosen by Σ♯A. If such a move is then taken by the abstract adversary, the invariant is clearly preserved. If instead the adversary takes another move, we can extend the concrete run accordingly, still preserving the invariant.

**Liquidity for Finite LTS.** We now give an alternative characterization of liquidity, which corresponds to Definition 2 on transition systems with finite traces, like the one obtained through the abstraction introduced in Sect. 4.

**Definition 10 (Maximal run).** *We say that a run* R *is* maximal *w.r.t. a set of strategies* **Σ** *when* R −ℓ→ *implies* ℓ ∉ Σ(R) *for every* Σ ∈ **Σ***.*

**Definition 11 (Liquidity for finite LTS).** *Assume that* A *is the only honest participant, with strategy* Σ♯A*. We say that* X0 *is* fin-liquid w.r.t. Σ♯A in R♯0 *when, for all extensions* R♯ *of* R♯0 *conforming to* Σ♯A *(and to some* Σ♯Adv*), if* R♯ *is maximal w.r.t.* Σ♯A, Σ♯Adv *and* x ∈ cn(ΓR♯)*, then orig*R♯0(R♯, x) ∉ X0*.*

**Lemma 6.** X0 *is ♯-liquid w.r.t.* Σ♯A *in* R♯0 *iff* X0 *is fin-liquid w.r.t.* Σ♯A *in* R♯0*.*

*Proof.* For the "only if" part, assume that X0 is ♯-liquid w.r.t. Σ♯A in R♯0, and let R♯ be a maximal extension (w.r.t. Σ♯A, Σ♯Adv) of R♯0 conforming to Σ♯A, Σ♯Adv. By Definition 6, condition (3) can only hold for Ṙ♯ = R♯. Hence, for all x ∈ cn(ΓR♯), by condition (4) it follows that origR♯0(R♯, x) ∉ X0.

For the "if" part, assume that X0 is fin-liquid w.r.t. Σ♯A in R♯0, and let R♯ be an extension of R♯0 conforming to Σ♯A, Σ♯Adv. There are two cases:


### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### Author Index

Alexander, Perry 197
Amidon, Peter 1
Antonopoulos, Timos 29
Askarov, Aslan 51
Aspinall, David 175
Bartoletti, Massimo 222
Butler, David 175
Chan, Matthew 1
Debant, Alexandre 149
Delaune, Stéphanie 149
Dras, Mark 123
Fernandes, Natasha 123
Gascón, Adrià 175
Gregersen, Simon 51
Helble, Sarah C. 197
Hicks, Michael 76, 99
Lampropoulos, Leonidas 76
Loscocco, Peter 197
McIver, Annabelle 123
Pendergrass, J. Aaron 197
Petz, Adam 197
Ramsdell, John D. 197
Rastogi, Aseem 99
Renner, John 1
Rowe, Paul D. 197
Ruef, Andrew 76
Soeller, Gary 1
Stefan, Deian 1
Swamy, Nikhil 99
Sweet, Ian 76
Tarditi, David 76
Terauchi, Tachio 29
Thomsen, Søren Eller 51
Vassena, Marco 1
Zunino, Roberto 222